Preface



To the outside world, insurance mathematics does not appear as a challeng- 
ing topic. In fact, everyone has to deal with matters of insurance at various 
times of one’s life. Hence this is quite an interesting perception of a field 
which constitutes one of the bases of modern society. There is no doubt that 
modern economies and states would not function without institutions which 
guarantee reimbursement to the individual, the company or the organization 
for its losses, which may occur due to natural or man-made catastrophes, 
fires, floods, accidents, riots, etc. The idea of insurance is part of our civilized 
world. It is based on the mutual trust of the insurer and the insured. 

It was realized early on that this mutual trust must be based on science, 
not on belief and speculation. In the 20th century the necessary tools for
dealing with matters of insurance were developed. These consist of probability
theory, statistics and stochastic processes. The Swedish mathematicians
Filip Lundberg and Harald Cramér were pioneers in these areas. They realized
in the first half of the 20th century that the theory of stochastic processes
provides the most appropriate framework for modeling the claims arriving in an
insurance business. Nowadays, the Cramér-Lundberg model is one of the backbones
of non-life insurance mathematics. It has been modified and extended
in very different directions and, moreover, has motivated research in various
other fields of applied probability theory, such as queuing theory, branching 
processes, renewal theory, reliability, dam and storage models, extreme value 
theory, and stochastic networks. 

The aim of this book is to bring some of the standard stochastic models 
of non-life insurance mathematics to the attention of a wide audience which, 
hopefully, will include actuaries and also other applied scientists. The primary 
objective of this book is to provide the undergraduate actuarial student with 
an introduction to non-life insurance mathematics. I used parts of this text in 
the course on basic non-life insurance for 3rd year mathematics students at the 
Laboratory of Actuarial Mathematics of the University of Copenhagen. But 
I am convinced that the content of this book will also be of interest to others 
who have a background in probability theory and stochastic processes and
would like to learn about applied stochastic processes. Insurance mathematics
is a part of applied probability theory. Moreover, its mathematical tools are 
also used in other applied areas (usually under different names). 

The idea of writing this book came in the spring of 2002, when I taught 
basic non-life insurance mathematics at the University of Copenhagen. My 
handwritten notes were not very much appreciated by the students, and so I 
decided to come up with some lecture notes for the next course given in spring, 
2003. This book is an extended version of those notes and the associated 
weekly exercises. I have also added quite a few computer graphics to the 
text. Graphs help one to understand and digest the theory much more easily than
formulae and proofs. In particular, computer simulations illustrate where the 
limits of the theory actually are. 

When one writes a book, one uses the experience and knowledge of gener- 
ations of mathematicians without being directly aware of it. Ole Hesselager’s 
1998 notes and exercises for the basic course on non-life insurance at the 
Laboratory of Actuarial Mathematics in Copenhagen were a guideline to the 
content of this book. I also benefitted from the collective experience of writing 
EKM [29]. The knowledgeable reader will see a few parallels between the two
books. However, this book is an introduction to non-life insurance, whereas
EKM assumes that the reader is familiar with the basics of this theory and
also explores various other topics of applied probability theory. After having 
read this book, the reader will be ready for EKM. Another influence has been 
Sid Resnick’s enjoyable book about Happy Harry [65]. I admit that some of 
the mathematical taste of that book has infected mine; the interested reader 
will find a wealth of applied stochastic process theory in [65] which goes far 
beyond the scope of this book. 

The choice of topics presented in this book has been dictated, on the one 
hand, by personal taste and, on the other hand, by some practical considera- 
tions. This course is the basis for other courses in the curriculum of the Danish 
actuarial education and therefore it has to cover a certain variety of topics. 
This education is in agreement with the Groupe Consultatif requirements, 
which are valid in most European countries. 

As regards personal taste, I very much focused on methods and ideas 
which, in one way or other, are related to renewal theory and point processes. 
I am in favor of methods where one can see the underlying probabilistic struc- 
ture without big machinery or analytical tools. This helps one to strengthen 
intuition. Analytical tools are like modern cars, whose functioning one can- 
not understand; one only finds out when they break down. Martingale and 
Markov process theory do not play an important role in this text. They are 
acting somewhere in the background and are not especially emphasized, since 
it is the author’s opinion that they are not really needed for an introduction 
to non-life insurance mathematics. Clearly, one has to pay a price for this 
approach: lack of elegance in some proofs, but with elegance it is very much 
like with modern cars. 







According to the maxim that non-Bayesians have more fun, Bayesian ideas 
do not play a major role in this text. Part II on experience rating is therefore 
rather short, but self-contained. Its inclusion is caused by the practical reasons 
mentioned above but it also pays respect to the influential contributions of 
Hans Bühlmann to modern insurance mathematics.

Some readers might miss a chapter on the interplay of insurance and fi- 
nance, which has been an open subject of discussion for many years. There 
is no doubt that the modern actuary should be educated in modern finan- 
cial mathematics, but that requires stochastic calculus and continuous-time 
martingale theory, which is far beyond the scope of this book. There exists a 
vast specialized literature on financial mathematics. This theory has dictated 
most of the research on financial products in insurance. To the best of the au- 
thor’s knowledge, there is no part of insurance mathematics which deals with 
the pricing and hedging of insurance products by techniques and approaches 
genuinely different from those of financial mathematics. 

It is a pleasure to thank my colleagues and students at the Laboratory 
of Actuarial Mathematics in Copenhagen for their support. Special thanks 
go to Jeffrey Collamore, who read much of this text and suggested numerous 
improvements upon my German way of writing English. I am indebted to 
Catriona Byrne from Springer-Verlag for professional editorial help.

If this book helps to change the perception that non-life insurance math- 
ematics has nothing to offer but boring calculations, its author has achieved 
his objective. 



Thomas Mikosch 



Copenhagen, September 2003 



Acknowledgment. This reprinted edition contains a large number of correc- 
tions scattered throughout the text. I am indebted to Uwe Schmock, Remigijus 
Leipus, Vicky Fasen and Anders Hedegaard Jessen, who have made sugges- 
tions for improvements and corrections. 



Thomas Mikosch 



Copenhagen, February 2006



Guidelines to the Reader



This book grew out of an introductory course on non-life insurance, which I 
taught several times at the Laboratory of Actuarial Mathematics of the Uni- 
versity of Copenhagen. This course was given at the third year of the actuarial 
studies which, together with an introductory course on life insurance, courses 
on law and accounting, and bachelor projects on life and non-life insurance, 
leads to the Bachelor’s degree in Actuarial Mathematics. This programme has 
been successfully composed and applied in the 1990s by Ragnar Norberg and 
his colleagues. In particular, I have benefitted from the notes and exercises of 
Ole Hesselager which, in a sense, formed the first step to the construction of 
this book. 

When giving a course for the first time, one is usually faced with the sit- 
uation that one looks for appropriate teaching material: one browses through 
the available literature (which is vast in the case of non-life insurance), and 
soon one realizes that the available texts do not exactly suit one’s needs for 
the course. 



What are the prerequisites for this book? 



Since the students of the Laboratory of Actuarial Mathematics in Copen- 
hagen have quite a good background in measure theory, probability theory 
and stochastic processes, it is natural to build a course on non-life insurance
based on knowledge of these theories. In particular, the theory of stochastic 
processes and applied probability theory (which insurance mathematics is a 
part of) have made significant progress over the last 50 years, and therefore 
it seems appropriate to use these tools even in an introductory course. 

On the other hand, the level of this course is not too advanced. For exam- 
ple, martingale and Markov process theory are avoided as much as possible 
and so are many analytical tools such as Laplace-Stieltjes transforms; these 
notions only appear in the exercises or footnotes. Instead I focused on a more 
intuitive probabilistic understanding of the risk and total claim amount
processes and their underlying random walk structure. A random walk is one of
the simplest stochastic processes and allows in many cases for explicit
calculations of distributions and their characteristics. If one goes this way, one
essentially walks along the path of renewal and point process theory. However, 
renewal theory will not be stressed too much, and only some of the essential 
tools such as the key renewal theorem will be explained at an informal level. 
Point process theory will be used indirectly at many places, in particular, in 
the section on the Poisson process, but also in this case the discussion will not 
go too far; the notion of a random measure will be mentioned but not really 
needed for the understanding of the succeeding sections and chapters. 

Summarizing the above, the reader of this book should have a good back- 
ground in probability and measure theory and in stochastic processes. Measure 
theoretic arguments can sometimes be replaced by intuitive arguments, but 
measure theory will make it easier to get through the chapters of this book. 



For whom is this book written? 



The book is primarily written for the undergraduate student who wants to 
learn about some fundamental results in non-life insurance mathematics by 
using the theory of stochastic processes. One of the differences from other texts 
of this kind is that I have tried to express most of the theory in the language of 
stochastic processes. As a matter of fact, Filip Lundberg and Harald Cramér
— two pioneers in actuarial mathematics — have worked in exactly this spirit: 
the insurance business in its parts is described as a continuous-time stochastic 
process. This gives a more complex view of insurance mathematics and allows 
one to apply recent results from the theory of stochastic processes. 

A widespread opinion about insurance mathematics (at least among math- 
ematicians) is that it is a rather dry and boring topic since one only calculates 
moments and does not really have any interesting structures. One of the aims 
of this book is to show that one should not take this opinion at face value and 
that it is enjoyable to work with the structures of non-life insurance mathe- 
matics. Therefore the present text can be interesting also for those who do 
not necessarily wish to spend the rest of their lives in an insurance company. 
The reader of this book could be a student in any field of applied mathemat- 
ics or statistics, a physicist or an engineer who wants to learn about applied 
stochastic models such as the Poisson, compound Poisson and renewal pro- 
cesses. These processes lie at the heart of this book and are fundamental 
in many other areas of applied probability theory, such as renewal theory, 
queuing, stochastic networks, and point process theory. The chapters of this 
book touch on more general topics than insurance mathematics. The inter- 
ested reader will find discussions about more advanced topics, with a list of 
relevant references, showing that insurance mathematics is not a closed world 
but open to other fields of applied probability theory, stochastic processes and 
statistics. 

How should you read this book?

Part I deals with collective risk models, i.e., models which describe the
evolution of an insurance portfolio as a mechanism, where claims and premiums
have to be balanced in order to avoid ruin. Part II studies the individual poli- 
cies and gives advice about how much premium should be charged depending 
on the policy experience represented by the claim data. There is little theo- 
retical overlap of these two parts; the models and the mathematical tools are 
completely different. 

The core material (and the more interesting one from the author’s point 
of view, since it uses genuine stochastic process theory) is contained in Part I. 
It is built up in a hierarchical way. You cannot start with Chapter 4 on ruin
theory without having understood Chapter 2 on claim number processes. 

Chapter 1 introduces the basic model of collective risk theory, combining 
claim sizes and claim arrival times. The claim number process, i.e., the count- 
ing process of the claim arrival times, is one of the main objects of interest 
in this book. It is dealt with in Chapter 2, where three major claim number 
processes are introduced: the Poisson process (Section 2.1), the renewal pro- 
cess (Section 2.2) and the mixed Poisson process (Section 2.3). Most of the 
material of these sections is relevant for the understanding of the remaining 
sections. However, some of the sections contain informal discussions (for ex- 
ample, about the generalized Poisson process or renewal theory), which can 
be skipped on first reading; only a few facts of those sections will be used 
later. The discussions at an informal level are meant as appetizers to make 
the reader curious and to invite him/her to learn about more advanced prob- 
abilistic structures. 

Chapter 3 studies the total claim amount process, i.e., the process of the 
aggregated claim sizes in the portfolio as a function of time. The order of mag- 
nitude of this object is of main interest, since it tells one how much premium 
should be charged in order to avoid ruin. Section 3.1 gives some quantita- 
tive measures for the order of magnitude of the total claim amount. Realistic 
claim size distributions are discussed in Section 3.2. In particular, we stress 
the notion of heavy-tailed distribution, which lies at the heart of (re)insurance 
and addresses how large claims or the largest claim can be modeled in an 
appropriate way. Over the last 30 years we have experienced major man- 
made and natural catastrophes; see Table 3.2.18, where the largest insurance 
losses are reported. They challenge the insurance industry, but they also call 
for improved mathematical modeling. In Section 3.2 we further discuss some 
exploratory statistical tools and illustrate them with real-life and simulated 
insurance data. Much of the material of this section is informal and the inter- 
ested reader is again referred to more advanced literature which might give 
answers to the questions which arose in the process of reading. In Section 3.3 
we touch upon the problem of how one can calculate or approximate the 
distribution of the total claim amount. Since this is a difficult and complex 
matter we cannot come up with complete solutions. We rather focus on one 
of the numerical methods for calculating this distribution, and then we give 
informal discussions of methods which are based on approximations or
simulations. These are quite specific topics and therefore their space is limited in
this book. The final Section 3.4 on reinsurance treaties introduces basic no- 
tions of the reinsurance language and discusses their relation to the previously 
developed theory. 

Chapter 4 deals with one of the highlights of non-life insurance mathemat- 
ics: the probability of ruin of a portfolio. Since the early work by Lundberg 
[55] and Cramér [23], this part has been considered a jewel of the theory. It is
rather demanding from a mathematical point of view. On the other hand, the 
reader learns how various useful concepts of applied probability theory (such 
as renewal theory, Laplace-Stieltjes transforms, integral equations) enter to 
solve this complicated problem. Section 4.1 gives a gentle introduction to the 
topic “ruin”. The famous results of Lundberg and Cramér on the order of
magnitude of the ruin probability are formulated and proved in Section 4.2.
The Cramér result, in particular, is perhaps the most challenging mathematical
result of this book. We prove it in detail; only at a few spots do we need
to borrow some more advanced tools from renewal theory. Cramér’s theorem
deals with ruin for the small claim case. We also prove the corresponding 
result for the large claim case, where one very large claim can cause ruin 
spontaneously. 

As mentioned above, Part II deals with models for the individual policies. 
Chapters 5 and 6 give a brief introduction to experience rating: how much 
premium should be charged for a policy based on the claim history? In these 
two chapters we introduce three major models (heterogeneity, Bühlmann,
Bühlmann-Straub) in order to describe the dependence of the claim
structure inside a policy and across the policies. Based on these models, we discuss
classical methods in order to determine a premium for a policy by taking 
into account the claim history and the overall portfolio experience (credibility 
theory). Experience rating and credibility theory are classical and influen- 
tial parts of non-life insurance mathematics. They do not require genuine 
techniques from stochastic process theory, but they are nevertheless quite de- 
manding: the proofs are quite technical. 

It is recommended that the reader who wishes to be successful should 
solve the exercises, which are collected at the end of each section; they are an 
integral part of this course. Moreover, some of the proofs in the sections are 
only sketched and the reader is recommended to complete them. The exercises 
also give some guidance to the solution of these problems. 

At the end of this book you will know about the fundamental models 
of non-life insurance mathematics and about applied stochastic processes. 
Then you may want to know more about stochastic processes in general and 
insurance models in particular. At the end of the sections and sometimes at 
suitable spots in the text you will find references to more advanced literature. 
They can be useful for the continuation of your studies. 

You are now ready to start. Good luck!



Collective Risk Models





1 The Basic Model



In 1903 the Swedish actuary Filip Lundberg [55] laid the foundations of modern
risk theory. Risk theory is a synonym for non-life insurance mathematics,
which deals with the modeling of claims that arrive in an insurance business 
and which gives advice on how much premium has to be charged in order to 
avoid bankruptcy (ruin) of the insurance company. 

One of Lundberg’s main contributions is the introduction of a simple model 
which is capable of describing the basic dynamics of a homogeneous insurance 
portfolio. By this we mean a portfolio of contracts or policies for similar risks 
such as car insurance for a particular kind of car, insurance against theft in 
households or insurance against water damage of one-family homes. 

There are three assumptions in the model: 

• Claims happen at the times $T_i$ satisfying $0 \le T_1 \le T_2 \le \cdots$. We call them
claim arrivals or claim times or claim arrival times or, simply, arrivals.

• The $i$th claim arriving at time $T_i$ causes the claim size or claim severity
$X_i$. The sequence $(X_i)$ constitutes an iid sequence of non-negative random
variables.

• The claim size process $(X_i)$ and the claim arrival process $(T_i)$ are mutually
independent.

The iid property of the claim sizes, $X_i$, reflects the fact that there is a
homogeneous probabilistic structure in the portfolio. The assumption that claim
sizes and claim times be independent is very natural from an intuitive point 
of view. But the independence of claim sizes and claim arrivals also makes 
the life of the mathematician much easier, i.e., this assumption is made for 
mathematical convenience and tractability of the model. 

Now we can define the claim number process 

$$N(t) = \#\{i \ge 1 : T_i \le t\}\,, \quad t \ge 0\,,$$

i.e., $N = (N(t))_{t \ge 0}$ is a counting process on $[0,\infty)$: $N(t)$ is the number of the
claims which occurred by time $t$.

The object of main interest from the point of view of an insurance company
is the total claim amount process or aggregate claim amount process:¹

$$S(t) = \sum_{i=1}^{N(t)} X_i = \sum_{i=1}^{\infty} X_i \, I_{[0,t]}(T_i)\,, \quad t \ge 0\,.$$

The process $S = (S(t))_{t \ge 0}$ is a random partial sum process which refers to the
fact that the deterministic index $n$ of the partial sums $S_n = X_1 + \cdots + X_n$ is
replaced by the random variables $N(t)$:

$$S(t) = X_1 + \cdots + X_{N(t)} = S_{N(t)}\,, \quad t \ge 0\,.$$

It is also often called a compound (sum) process. We will observe that the 
total claim amount process S shares various properties with the partial sum 
process. For example, asymptotic properties such as the central limit theorem 
and the strong law of large numbers are analogous for the two processes; see 
Section 3.1.2. 
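
As a small illustration of the two representations of $S(t)$ given above, the following Python sketch evaluates $N(t)$ and $S(t)$ for a fixed list of arrival times and claim sizes. The function names and the data are assumptions made up for illustration only; they do not come from the text.

```python
from bisect import bisect_right

def claim_number(arrivals, t):
    """N(t) = #{i >= 1 : T_i <= t} for arrival times sorted in increasing order."""
    return bisect_right(arrivals, t)

def total_claim_amount(arrivals, sizes, t):
    """S(t) = X_1 + ... + X_{N(t)}; the empty sum (N(t) = 0) equals 0."""
    return sum(sizes[:claim_number(arrivals, t)])

def total_claim_amount_indicator(arrivals, sizes, t):
    """S(t) written as the sum of X_i * I_[0,t](T_i) over all i."""
    return sum(x for T, x in zip(arrivals, sizes) if 0 <= T <= t)

# Hypothetical data: four claims, two of them arriving at the same time.
arrivals = [0.5, 1.2, 1.2, 3.0]   # T_1 <= T_2 <= T_3 <= T_4
sizes = [10.0, 2.5, 4.0, 7.0]     # X_1, ..., X_4

for t in (0.1, 1.2, 5.0):
    print(t, claim_number(arrivals, t), total_claim_amount(arrivals, sizes, t))
```

Both functions for $S(t)$ return the same value for every $t$; the first uses the random index $N(t)$, the second the indicator representation.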

In Figure 1.0.1 we see a sample path of the process $N$ and the corresponding
sample path of the compound sum process $S$. Both paths jump at the
same times $T_i$: by 1 for $N$ and by $X_i$ for $S$.

Figure 1.0.1 A sample path of the claim arrival process $N$ (left) and of the
corresponding total claim amount process $S$ (right). Mind the difference of the jump
sizes!



One would like to solve the following problems by means of insurance 
mathematical methods: 

¹ Here and in what follows, $\sum_{i=1}^{0} a_i = 0$ for any real $a_i$ and $I_A$ is the indicator
function of any set $A$: $I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$ if $x \notin A$.

• Find sufficiently realistic, but simple,² probabilistic models for $S$ and $N$.
This means that we have to specify the distribution of the claim sizes $X_i$
and to introduce models for the claim arrival times $T_i$. The discrepancy between
“realistic” and “simple” models is closely related to the question to
which extent a mathematical model can describe the complicated dynamics
of an insurance portfolio without being mathematically intractable.

• Determine the theoretical properties of the stochastic processes S and N. 
Among other things, we are interested in the distributions of S and N, 
their distributional characteristics such as the moments, the variance and 
the dependence structure. We will study the asymptotic behavior of N(t) 
and S(t) for large t and the average behavior of N and S in the interval 
[0 , t] . To be more specific, we will give conditions under which the strong 
law of large numbers and the central limit theorem hold for S and N. 

• Give simulation procedures for the processes N and S. Simulation methods 
have become more and more popular over the last few years. In many 
cases they have replaced rigorous probabilistic and/or statistical methods. 
The increasing power of modern computers allows one to simulate various 
scenarios of possible situations an insurance business might have to face 
in the future. This does not mean that no theory is needed any more. On 
the contrary, simulation generally must be based on probabilistic models 
for N and S; the simulation procedure itself must exploit the theoretical
properties of the processes to be simulated. 

• Based on the theoretical properties of N and S, give advice on how to choose
a premium in order to cover the claims in the portfolio, how to build 
reserves, how to price insurance products, etc. 
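
A minimal simulation sketch in the spirit of the third item might look as follows. It assumes a specific model that the text has not fixed at this point: claims arrive according to a homogeneous Poisson process with rate `rate`, generated through iid exponential inter-arrival times, and claim sizes are iid exponential with mean `mean_claim`. All names and parameter values are illustrative assumptions.

```python
import random

def simulate_claims(t, rate, mean_claim, rng):
    """Return one simulated pair (N(t), S(t)) on the interval [0, t]."""
    n, s = 0, 0.0
    arrival = rng.expovariate(rate)                 # first arrival time T_1
    while arrival <= t:
        n += 1
        s += rng.expovariate(1.0 / mean_claim)      # claim size X_n, independent of arrivals
        arrival += rng.expovariate(rate)            # next arrival T_{n+1}
    return n, s

rng = random.Random(2003)                           # fixed seed for reproducibility
draws = [simulate_claims(t=10.0, rate=2.0, mean_claim=1.0, rng=rng) for _ in range(2000)]
mean_n = sum(n for n, _ in draws) / len(draws)
mean_s = sum(s for _, s in draws) / len(draws)
print(mean_n, mean_s)   # both should be close to rate * t = 20 under these assumptions
```

The procedure relies on the model assumptions (independence of claim sizes and arrivals, iid claim sizes), which is the sense in which simulation rests on the theory rather than replacing it.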

Although statistical inference on the processes S and N is utterly important 
for the insurance business, we do not address this aspect in a rigorous way. The 
statistical analysis of insurance data is not different from standard statistical 
methods which have been developed for iid data and for counting processes. 
Whereas there exist numerous monographs dealing with the inference of iid 
data, books on the inference of counting processes are perhaps less known. 
We refer to the book by Andersen et al. [2] for a comprehensive treatment. 

We start with the extensive Chapter 2 on the modeling of the claim number 
process N. The process of main interest is the Poisson process. It is treated in 
Section 2.1. The Poisson process has various attractive theoretical properties 
which have been collected for several decades. Therefore it is not surprising 
that it made its way into insurance mathematics from the very beginning, 
starting with Lundberg’s thesis [55] . Although the Poisson process is perhaps 
not the most realistic process when it comes to fitting real-life claim arrival 
times, it is kind of a benchmark process. Other models for N are modifications 
of the Poisson process which yield greater flexibility in one way or the other. 

² This requirement is in agreement with Einstein’s maxim “as simple as possible,
but not simpler”.

This concerns the renewal process which is considered in Section 2.2. It
allows for more flexibility in choosing the distribution of the inter-arrival times
$T_i - T_{i-1}$. But one has to pay a price: in contrast to the Poisson process when
N(t) has a Poisson distribution for every t, this property is in general not 
valid for a renewal process. Moreover, the distribution of N(t) is in general 
not known. Nevertheless, the study of the renewal process has led to a strong 
mathematical theory, the so-called renewal theory, which allows one to make 
quite precise statements about the expected claim number EN(t) for large 
t. We sketch renewal theory in Section 2.2.2 and explain what its purpose is 
without giving all mathematical details, which would be beyond the scope of 
this text. We will see in Section 4.2.2 on ruin probabilities that the so-called 
renewal equation is a very powerful tool which gives us a hand on measuring 
the probability of ruin in an insurance portfolio. A third model for the claim 
number process N is considered in Section 2.3: the mixed Poisson process. 
It is another modification of the Poisson process. By randomization of the 
parameters of a Poisson process (“mixing”) one obtains a class of processes 
which exhibit a much larger variety of sample paths than for the Poisson or 
the renewal processes. We will see that the mixed Poisson process has some 
distributional properties which completely differ from the Poisson process. 

After the extensive study of the claim number process we focus in Chapter 3
on the theoretical properties of the total claim amount process $S$. We
start in Section 3.1 with a description of the order of magnitude of $S(t)$.
Results include the mean and the variance of $S(t)$ (Section 3.1.1) and asymptotic
properties such as the strong law of large numbers and the central limit
theorem for $S(t)$ as $t \to \infty$ (Section 3.1.2). We also discuss classical premium
calculation principles (Section 3.1.3) which are rules of thumb for how large
the premium in a portfolio should be in order to avoid ruin. These principles
are consequences of the theoretical results on the growth of $S(t)$ for large $t$.
In Section 3.2 we hint at realistic claim size distributions. In particular, we
focus on heavy-tailed claim size distributions and study some of their theoretical
properties. Distributions with regularly varying tails and subexponential
distributions are introduced as the natural classes of distributions which are
capable of describing large claim sizes. Section 3.3 continues with a study of
the distributional characteristics of $S(t)$. We show some nice closure properties
which certain total claim amount models (“mixture distributions”) obey;
see Section 3.3.1. We also show the surprising result that a disjoint decomposition
of time and/or claim size space yields independent total claim amounts
on the different pieces of the partition; see Section 3.3.2. Then various exact
(numerical; see Section 3.3.3) and approximate (Monte Carlo, bootstrap,
central limit theorem based; see Section 3.3.4) methods for determining the
distribution of $S(t)$, their advantages and drawbacks are discussed. Finally, in
Section 3.4 we give an introduction to reinsurance treaties and show the link 
to previous theory. 

A major building block of classical risk theory is devoted to the probability 
of ruin; see Chapter 4. It is a global measure of the risk one encounters in a
portfolio over a long time horizon. We deal with the classical small claim case
and give the celebrated estimates of Cramér and Lundberg (Sections 4.2.1 and
4.2.2). These results basically say that ruin is very unlikely for small claim 
sizes. In contrast to the latter results, the large claim case yields completely 
different results: ruin is not unlikely; see Section 4.2.4. 




2 Models for the Claim Number Process



2.1 The Poisson Process 

In this section we consider the most common claim number process: the Pois- 
son process. It has very desirable theoretical properties. For example, one can 
derive its finite-dimensional distributions explicitly. The Poisson process has a 
long tradition in applied probability and stochastic process theory. In his 1903 
thesis, Filip Lundberg already exploited it as a model for the claim number
process $N$. Later on in the 1930s, Harald Cramér, the famous Swedish statistician
and probabilist, extensively developed collective risk theory by using
the total claim amount process $S$ with arrivals $T_i$ which are generated by a
Poisson process. For historical reasons, but also since it has very attractive 
mathematical properties, the Poisson process plays a central role in insurance 
mathematics. 

Below we will give a definition of the Poisson process, and for this purpose 
we now introduce some notation. For any real-valued function $f$ on $[0,\infty)$ we
write

$$f(s,t] = f(t) - f(s)\,, \quad 0 \le s < t < \infty\,.$$

Recall that an integer-valued random variable $M$ is said to have a Poisson
distribution with parameter $\lambda > 0$ ($M \sim \mathrm{Pois}(\lambda)$) if it has distribution

$$P(M = k) = e^{-\lambda}\, \frac{\lambda^k}{k!}\,, \quad k = 0, 1, \ldots\,.$$

We say that the random variable $M = 0$ a.s. has a $\mathrm{Pois}(0)$ distribution. Now
we are ready to define the Poisson process.

Definition 2.1.1 (Poisson process) 

A stochastic process $N = (N(t))_{t \ge 0}$ is said to be a Poisson process if the
following conditions hold:

(1) The process starts at zero: $N(0) = 0$ a.s.

(2) The process has independent increments: for any $t_i$, $i = 0, \ldots, n$, and
$n \ge 1$ such that $0 = t_0 < t_1 < \cdots < t_n$, the increments $N(t_{i-1}, t_i]$,
$i = 1, \ldots, n$, are mutually independent.

(3) There exists a non-decreasing right-continuous function
$\mu : [0,\infty) \to [0,\infty)$ with $\mu(0) = 0$ such that the increments $N(s,t]$ for
$0 \le s < t < \infty$ have a Poisson distribution $\mathrm{Pois}(\mu(s,t])$. We call $\mu$ the
mean value function of $N$.

(4) With probability 1, the sample paths $(N(t,\omega))_{t \ge 0}$ of the process $N$ are
right-continuous for $t \ge 0$ and have limits from the left for $t > 0$. We say
that $N$ has càdlàg (continue à droite, limites à gauche) sample paths.

We continue with some comments on this definition and some immediate 
consequences. 

We know that a Poisson random variable $M$ has the rare property that

$$\lambda = EM = \mathrm{var}(M)\,,$$

i.e., it is determined only by its mean value (= variance) if the distribution is
specified as Poisson. The definition of the Poisson process essentially says that,
in order to determine the distribution of the Poisson process $N$, it suffices to
know its mean value function. The mean value function $\mu$ can be considered
as an inner clock or operational time of the counting process $N$. Depending
on the magnitude of $\mu(s,t]$ in the interval $(s,t]$, $s < t$, it determines how
large the random increment $N(s,t]$ is.

Since $N(0) = 0$ a.s. and $\mu(0) = 0$,

$$N(t) = N(t) - N(0) = N(0,t] \sim \mathrm{Pois}(\mu(0,t]) = \mathrm{Pois}(\mu(t))\,.$$

We know that the distribution of a stochastic process (in the sense of
Kolmogorov’s consistency theorem¹) is determined by its finite-dimensional
distributions. The finite-dimensional distributions of a Poisson process have
a rather simple structure: for $0 = t_0 < t_1 < \cdots < t_n < \infty$,

$$(N(t_1), N(t_2), \ldots, N(t_n)) = \Big(N(t_1)\,,\; N(t_1) + N(t_1,t_2]\,,\; N(t_1) + N(t_1,t_2] + N(t_2,t_3]\,,\; \ldots\,,\; \sum_{i=1}^{n} N(t_{i-1},t_i]\Big)\,,$$

where any of the random variables on the right-hand side is Poisson distributed.
The independent increment property makes it easy to work with the
finite-dimensional distributions of $N$: for any integers $k_i \ge 0$,
$i = 1, \ldots, n$,

$$\begin{aligned}
&P(N(t_1) = k_1\,,\, N(t_2) = k_1 + k_2\,,\, \ldots\,,\, N(t_n) = k_1 + \cdots + k_n) \\
&\quad= P(N(t_1) = k_1\,,\, N(t_1,t_2] = k_2\,,\, \ldots\,,\, N(t_{n-1},t_n] = k_n) \\
&\quad= e^{-\mu(t_1)}\,\frac{(\mu(t_1))^{k_1}}{k_1!}\; e^{-\mu(t_1,t_2]}\,\frac{(\mu(t_1,t_2])^{k_2}}{k_2!} \cdots e^{-\mu(t_{n-1},t_n]}\,\frac{(\mu(t_{n-1},t_n])^{k_n}}{k_n!} \\
&\quad= e^{-\mu(t_n)}\,\frac{(\mu(t_1))^{k_1}}{k_1!}\,\frac{(\mu(t_1,t_2])^{k_2}}{k_2!} \cdots \frac{(\mu(t_{n-1},t_n])^{k_n}}{k_n!}\,.
\end{aligned}$$

¹ Two stochastic processes on the real line have the same distribution in the sense
of Kolmogorov’s consistency theorem (cf. Rogers and Williams [66], p. 123, or
Billingsley [13], p. 510) if their finite-dimensional distributions coincide. Here one
considers the processes as random elements with values in the product space
$\mathbb{R}^{[0,\infty)}$ of real-valued functions on $[0,\infty)$, equipped with the
$\sigma$-field generated by the cylinder sets of $\mathbb{R}^{[0,\infty)}$.

The càdlàg property is nothing but a standardization property and of
purely mathematical interest which, among other things, ensures the
measurability property of the stochastic process $N$ in certain function spaces.²
As a matter of fact, it is possible to show that one can define a process $N$ on
$[0,\infty)$ satisfying properties (1)-(3) of the Poisson process and having sample
paths which are left-continuous and have limits from the right.³ Later, in
Section 2.1.4, we will give a constructive definition of the Poisson process. That
version will automatically be càdlàg.
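
The product formula for the finite-dimensional distributions translates directly
into a few lines of code. The following is a minimal sketch, not part of the
original text: the function names `poisson_pmf` and `fidi_probability` and the
linear mean value function used at the end are illustrative assumptions.

```python
import math

def poisson_pmf(k, mean):
    """P(M = k) for M ~ Pois(mean); Pois(0) is the point mass at 0."""
    if mean == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mean) * mean**k / math.factorial(k)

def fidi_probability(mu, times, levels):
    """P(N(t_1) = m_1, ..., N(t_n) = m_n) for a Poisson process with mean value function mu.

    `times` must be increasing and positive; `levels` are the cumulative counts
    m_i = k_1 + ... + k_i. A decreasing sequence of levels has probability 0.
    """
    prob, prev_t, prev_m = 1.0, 0.0, 0
    for t, m in zip(times, levels):
        k = m - prev_m                      # size of the increment N(t_{i-1}, t_i]
        if k < 0:
            return 0.0
        prob *= poisson_pmf(k, mu(t) - mu(prev_t))
        prev_t, prev_m = t, m
    return prob

mu = lambda t: 2.0 * t                      # hypothetical mean value function
p = fidi_probability(mu, [1.0, 3.0], [1, 2])  # P(N(1) = 1, N(3) = 2)
print(p)
```

Only the independence of the increments and their Poisson distributions are
used, so any non-decreasing `mu` with `mu(0) = 0` can be plugged in.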



2.1.1 The Homogeneous Poisson Process, the Intensity Function, 
the Cramér-Lundberg Model

The most popular Poisson process corresponds to the case of a linear mean
value function $\mu$:

$$\mu(t) = \lambda\, t\,, \qquad t \ge 0\,,$$

for some $\lambda > 0$. A process with such a mean value function is said to be
homogeneous, inhomogeneous otherwise. The quantity $\lambda$ is the intensity or
rate of the homogeneous Poisson process. If $\lambda = 1$, $N$ is called a standard
homogeneous Poisson process.

More generally, we say that $N$ has an intensity function or rate function
$\lambda$ if $\mu$ is absolutely continuous, i.e., for any $s < t$ the increment
$\mu(s,t]$ has representation

$$\mu(s,t] = \int_s^t \lambda(y)\,dy\,, \qquad s < t\,,$$

for some non-negative measurable function $\lambda$. A particular consequence is
that $\mu$ is a continuous function.

We mentioned that $\mu$ can be interpreted as operational time or inner clock
of the Poisson process. If $N$ is homogeneous, time evolves linearly:
$\mu(s,t] = \mu(s+h, t+h]$ for any $h > 0$ and $0 \le s < t < \infty$. Intuitively,
this means that claims arrive roughly uniformly over time. We will see later, in
Section 2.1.6, that this intuition is supported by the so-called order statistics
property of a Poisson process. If $N$ has a non-constant intensity function
$\lambda$, time “slows down” or “speeds up” according to the magnitude of
$\lambda(t)$. In Figure 2.1.2 we illustrate this effect for different choices of
$\lambda$. In an insurance context, non-constant $\lambda$ may refer to seasonal
effects or trends. For example, in Denmark more car accidents happen in winter
than in summer due to bad weather conditions. Trends can, for example, refer to
an increasing frequency of (in particular, large) claims over the last few years.
Such an effect has been observed in windstorm insurance in Europe and is
sometimes mentioned in the context of climate change. Table 3.2.18 contains the
largest insurance losses occurring in the period 1970-2002: the arrivals of the
largest claim sizes visibly cluster towards the end of this time period. We also
refer to Section 2.1.7 for an illustration of seasonal and trend effects in a
real-life claim arrival sequence.

² A suitable space is the Skorokhod space $\mathbb{D}$ of càdlàg functions on
$[0,\infty)$; cf. Billingsley [12].

³ See Chapter 2 in Sato [71].

A homogeneous Poisson process $N$ with intensity $\lambda$

(1) has càdlàg sample paths,

(2) starts at zero,

(3) has independent and stationary increments,

(4) satisfies $N(t) \sim \mathrm{Pois}(\lambda t)$ for every $t > 0$.

Stationarity of the increments refers to the fact that for any $0 \le s < t$ and
$h > 0$,

$$N(s,t] \stackrel{d}{=} N(s+h, t+h] \sim \mathrm{Pois}(\lambda\,(t-s))\,,$$

i.e., the Poisson parameter of an increment only depends on the length of the
interval, not on its location.

A process on $[0,\infty)$ with properties (1)-(3) is called a Lévy process. The
homogeneous Poisson process is one of the prime examples of Lévy processes
with applications in various areas such as queuing theory, finance, insurance,
stochastic networks, to name a few. Another prime example of a Lévy process
is Brownian motion $B$. In contrast to the Poisson process, which is a pure jump
process, Brownian motion has continuous sample paths with probability 1 and
its increments $B(s,t]$ are normally $N(0, \sigma^2 (t-s))$ distributed for some
$\sigma > 0$. Brownian motion has a multitude of applications in physics and
finance, but also in insurance mathematics. Over the last 30 years, Brownian
motion has been used to model prices of speculative assets (share prices, foreign
exchange rates, composite stock indices, etc.).

Finance and insurance have been merging for many years. Among other
things, insurance companies invest in financial derivatives (options, futures,
etc.) which are commonly modeled by functions of Brownian motion such as
solutions to stochastic differential equations. If one wants to take into account
jump characteristics of real-life financial/insurance phenomena, the Poisson
process, or one of its many modifications, in combination with Brownian motion,
offers the opportunity to model financial/insurance data more realistically.
In this course, we follow the classical tradition of non-life insurance,
where Brownian motion plays a less prominent role. This is in contrast to
modern life insurance which deals with the inter-relationship of financial and
insurance products. For example, unit-linked life insurance can be regarded
as classical life insurance which is linked to a financial underlying such as a
composite stock index (DAX, S&P 500, Nikkei, CAC40, etc.). Depending on
the performance of the underlying, the policyholder can gain an additional
bonus in excess of the cash amount which is guaranteed by the classical life
insurance contracts.

Figure 2.1.2 One sample path of a Poisson process with intensity 0.5 (top left), 1
(top right) and 2 (bottom). The straight lines indicate the corresponding mean value
functions. For $\lambda = 0.5$ jumps occur less often than for the standard
homogeneous Poisson process, whereas they occur more often when $\lambda = 2$.










Now we introduce one of the models which will be most relevant through- 
out this text. 

Example 2.1.3 (The Cramér-Lundberg model)

The homogeneous Poisson process plays a major role in insurance mathematics.
If we specify the claim number process as a homogeneous Poisson process,
the resulting model which combines claim sizes and claim arrivals is called
the Cramér-Lundberg model:

• Claims happen at the arrival times $0 \le T_1 \le T_2 \le \cdots$ of a homogeneous
Poisson process $N(t) = \#\{i \ge 1 : T_i \le t\}$, $t \ge 0$.

• The $i$th claim arriving at time $T_i$ causes the claim size $X_i$. The sequence
$(X_i)$ constitutes an iid sequence of non-negative random variables.

• The sequences $(T_i)$ and $(X_i)$ are independent. In particular, $N$ and $(X_i)$
are independent.

The total claim amount process $S$ in the Cramér-Lundberg model is also called
a compound Poisson process.

The Cramér-Lundberg model is one of the most popular and useful models
in non-life insurance mathematics. Despite its simplicity it describes some of
the essential features of the total claim amount process which is observed in
reality.

We mention in passing that the total claim amount process $S$ in the
Cramér-Lundberg setting is a process with independent and stationary
increments, starts at zero and has càdlàg sample paths. It is another important
example of a Lévy process. Try to show these properties! □
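
To make the three ingredients of the Cramér-Lundberg model concrete, here is a
minimal simulation sketch. It is an illustration under stated assumptions, not
the book's own algorithm: arrivals are generated from iid exponential
inter-arrival times (the representation justified later in Theorem 2.1.6), the
total claim amount is taken as the sum of all claim sizes with arrival time at
most $t$, and the Exp(1) claim size distribution, the function names and the
parameter values are hypothetical choices.

```python
import random

def simulate_claims(lam, claim_sampler, horizon, rng):
    """Return (arrival_times, claim_sizes) of all claims arriving in [0, horizon]."""
    arrivals, sizes = [], []
    t = rng.expovariate(lam)            # T_1 = W_1
    while t <= horizon:
        arrivals.append(t)
        sizes.append(claim_sampler(rng))  # X_i, drawn independently of the arrivals
        t += rng.expovariate(lam)       # T_{i+1} = T_i + W_{i+1}
    return arrivals, sizes

def total_claim_amount(arrivals, sizes, t):
    """S(t): sum of the claim sizes X_i over all i with T_i <= t."""
    return sum(x for a, x in zip(arrivals, sizes) if a <= t)

rng = random.Random(2024)
exp_claims = lambda r: r.expovariate(1.0)   # hypothetical Exp(1) claim sizes
arrivals, sizes = simulate_claims(2.0, exp_claims, 10.0, rng)
print(len(arrivals), total_claim_amount(arrivals, sizes, 10.0))
```

Any non-negative claim size sampler can be passed in place of `exp_claims`;
heavier-tailed choices change the behaviour of $S$ but not the arrival mechanism.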

Comments 

The reader who wants to learn about Lévy processes is referred to Sato’s
monograph [71]. For applications of Lévy processes in different areas, see the
recent collection of papers edited by Barndorff-Nielsen et al. [9]. Rogers and
Williams [66] can be recommended as an introduction to Brownian motion,
its properties and related topics such as stochastic differential equations. For
an elementary introduction, see Mikosch [57].

2.1.2 The Markov Property 

Poisson processes constitute one particular class of Markov processes on
$[0,\infty)$ with state space $\mathbb{N}_0 = \{0, 1, \ldots\}$. This is a simple
consequence of the independent increment property. It is left as an exercise to
verify the Markov property, i.e., for any $0 = t_0 < t_1 < \cdots < t_n$ and
non-decreasing natural numbers $k_i \ge 0$, $i = 1, \ldots, n$, $n \ge 2$,

$$P(N(t_n) = k_n \mid N(t_1) = k_1\,, \ldots\,, N(t_{n-1}) = k_{n-1}) = P(N(t_n) = k_n \mid N(t_{n-1}) = k_{n-1})\,.$$

Markov process theory does not play a prominent role in this course,⁴ in
contrast to a course on modern life insurance mathematics, where Markov
models are fundamental.⁵ However, the intensity function of a Poisson process
$N$ has a nice interpretation as the intensity function of the Markov process
$N$. Before we make this statement precise, recall that the quantities

$$p_{k,k+h}(s,t) = P(N(t) = k + h \mid N(s) = k) = P(N(t) - N(s) = h)\,, \qquad 0 \le s < t\,,\quad k, h \in \mathbb{N}_0\,,$$

are called the transition probabilities of the Markov process $N$ with state space
$\mathbb{N}_0$. Since almost every path $(N(t,\omega))_{t \ge 0}$ is non-decreasing
(verify this), one only needs to consider transitions of the Markov process $N$
from $k$ to $k + h$ for $h \ge 0$. The transition probabilities are closely related
to the intensities which are given as the limits

$$\lambda_{k,k+h}(t) = \lim_{s \downarrow 0} \frac{p_{k,k+h}(t, t+s)}{s}\,,$$

provided they and their analogs from the left exist, are finite and coincide.
From the theory of stochastic processes, we know that the intensities and
the initial distribution of a Markov process determine the distribution of this
Markov process.⁶

Proposition 2.1.4 (Relation of the intensity function of the Poisson process
and its Markov intensities)

Consider a Poisson process $N = (N(t))_{t \ge 0}$ which has a continuous intensity
function $\lambda$ on $[0,\infty)$. Then, for $k \ge 0$,

$$\lambda_{k,k+h}(t) = \begin{cases} \lambda(t) & \text{if } h = 1\,, \\ 0 & \text{if } h > 1\,. \end{cases}$$

In words, the intensity function $\lambda(t)$ of the Poisson process $N$ is nothing
but the intensity of the Markov process $N$ for the transition from state $k$ to
state $k + 1$. The proof of this result is left as an exercise.

⁴ It is, however, no contradiction to say that almost all stochastic models in this
course have a Markov structure. But we do not emphasize this property.

⁵ See for example Koller [52].

⁶ We leave this statement as vague as it is. The interested reader is, for example,
referred to Resnick [65] or Rogers and Williams [66] for further reading on Markov
processes.

The intensity function of a Markov process is a quantitative measure of
the likelihood that the Markov process $N$ jumps in a small time interval. An
immediate consequence of Proposition 2.1.4 is that it is very unlikely that
a Poisson process with continuous intensity function $\lambda$ has jump sizes
larger than 1. Indeed, consider the probability that $N$ has a jump greater than 1
in the interval $(t, t+s]$ for some $t \ge 0$, $s > 0$:⁷

$$\begin{aligned}
P(N(t,t+s] \ge 2) &= 1 - P(N(t,t+s] = 0) - P(N(t,t+s] = 1) \\
&= 1 - e^{-\mu(t,t+s]} - \mu(t,t+s]\; e^{-\mu(t,t+s]}\,.
\end{aligned} \tag{2.1.1}$$

Since $\lambda$ is continuous,

$$\mu(t,t+s] = \int_t^{t+s} \lambda(y)\,dy = s\,\lambda(t)\,(1 + o(1)) \to 0\,, \qquad \text{as } s \downarrow 0\,.$$

Moreover, a Taylor expansion yields for $x \to 0$ that $e^x = 1 + x + o(x)$. Thus
we may conclude from (2.1.1) that, as $s \downarrow 0$,

$$P(N(t,t+s] \ge 2) = o(\mu(t,t+s]) = o(s)\,. \tag{2.1.2}$$

It is easily seen that

$$P(N(t,t+s] = 1) = \lambda(t)\, s\,(1 + o(1))\,. \tag{2.1.3}$$

Relations (2.1.2) and (2.1.3) ensure that a Poisson process $N$ with continuous
intensity function $\lambda$ is very unlikely to have jump sizes larger than 1.
Indeed, we will see in Section 2.1.4 that $N$ has only upward jumps of size 1 with
probability 1.
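
Relations (2.1.2) and (2.1.3) can be observed numerically by letting $s$ shrink.
The sketch below is illustrative only: the intensity $\lambda(y) = 2 + \sin y$, the
point $t = 1$, the midpoint-rule integration and the function names are
assumptions made for this example.

```python
import math

def mean_increment(lam, t, s, steps=1000):
    """Approximate mu(t, t+s], the integral of lam over (t, t+s], by the midpoint rule."""
    h = s / steps
    return h * sum(lam(t + (i + 0.5) * h) for i in range(steps))

def jump_probabilities(lam, t, s):
    """Return (P(N(t,t+s] = 1), P(N(t,t+s] >= 2)) for a Poisson process with intensity lam."""
    m = mean_increment(lam, t, s)
    p_one = m * math.exp(-m)
    # 1 - e^{-m} - m e^{-m} as in (2.1.1); expm1 avoids cancellation for small m
    p_two_or_more = -math.expm1(-m) - p_one
    return p_one, p_two_or_more

lam = lambda y: 2.0 + math.sin(y)   # hypothetical continuous intensity function
for s in (1e-1, 1e-2, 1e-3, 1e-4):
    p1, p2 = jump_probabilities(lam, 1.0, s)
    print(f"s={s:g}  P(=1)/s={p1 / s:.5f}  P(>=2)/s={p2 / s:.6f}")
```

As $s$ decreases, the first ratio should approach $\lambda(1)$ and the second
should tend to zero, in line with (2.1.3) and (2.1.2).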



2.1.3 Relations Between the Homogeneous and the 
Inhomogeneous Poisson Process 

The homogeneous and the inhomogeneous Poisson processes are very closely 
related: we will show in this section that a deterministic time change trans- 
forms a homogeneous Poisson process into an inhomogeneous Poisson process, 
and vice versa. 

Let $N$ be a Poisson process on $[0,\infty)$ with mean value function⁸ $\mu$. We
start with a standard homogeneous Poisson process $\widetilde N$ and define

$$\widehat N(t) = \widetilde N(\mu(t))\,, \qquad t \ge 0\,.$$

It is not difficult to see that $\widehat N$ is again a Poisson process on
$[0,\infty)$. (Verify this! Notice that the càdlàg property of $\mu$ is used to
ensure the càdlàg property of the sample paths $\widehat N(t,\omega)$.) Since

$$\widehat\mu(t) = E\widehat N(t) = E\widetilde N(\mu(t)) = \mu(t)\,, \qquad t \ge 0\,,$$

and since the distribution of the Poisson process $\widehat N$ is determined by its
mean value function $\widehat\mu$, it follows that $N \stackrel{d}{=} \widehat N$,
where $\stackrel{d}{=}$ refers to equality of the finite-dimensional distributions
of the two processes. Hence the processes $\widehat N$ and $N$ are not
distinguishable from a probabilistic point of view, in the sense of Kolmogorov’s
consistency theorem; see the remark on p. 14. Moreover, the sample paths of
$\widehat N$ are càdlàg as required in the definition of the Poisson process.

Now assume that $N$ has a continuous and increasing mean value function
$\mu$. This property is satisfied if $N$ has an a.e. positive intensity function
$\lambda$. Then the inverse $\mu^{-1}$ of $\mu$ exists. It is left as an exercise to
show that the process $\widetilde N(t) = N(\mu^{-1}(t))$ is a standard homogeneous
Poisson process on $[0,\infty)$ if $\lim_{t \to \infty} \mu(t) = \infty$.⁹

⁷ Here and in what follows, we frequently use the o-notation. Recall that we write
for any real-valued function $h$, $h(x) = o(1)$ as $x \to x_0 \in [-\infty,\infty]$ if
$\lim_{x \to x_0} h(x) = 0$ and we write $h(x) = o(g(x))$ as $x \to x_0$ if
$h(x) = g(x)\,o(1)$ for any real-valued function $g(x)$.

⁸ Recall that the mean value function of a Poisson process starts at zero, is
non-decreasing, right-continuous and finite on $[0,\infty)$. In particular, it is a
càdlàg function.

We summarize our findings. 

Proposition 2.1.5 (The Poisson process under change of time) 

Let $\mu$ be the mean value function of a Poisson process $N$ and $\widetilde N$
be a standard homogeneous Poisson process. Then the following statements hold:

(1) The process $(\widetilde N(\mu(t)))_{t \ge 0}$ is Poisson with mean value
function $\mu$.

(2) If $\mu$ is continuous, increasing and $\lim_{t \to \infty} \mu(t) = \infty$ then
$(N(\mu^{-1}(t)))_{t \ge 0}$ is a standard homogeneous Poisson process.

This result, which immediately follows from the definition of a Poisson process, 
allows one in most cases of practical interest to switch from an inhomogeneous 
Poisson process to a homogeneous one by a simple time change. In particular, 
it suggests a straightforward way of simulating sample paths of an inhomoge- 
neous Poisson process N from the paths of a homogeneous Poisson process. 
In an insurance context, one will usually be faced with inhomogeneous claim 
arrival processes. The above theory allows one to make an “operational time 
change” to a homogeneous model for which the theory is more accessible. See 
also Section 2.1.7 for a real-life example. 
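
The time change can be turned into a short simulation recipe. The following is
a sketch under stated assumptions rather than a definitive implementation: the
standard homogeneous process is generated from iid Exp(1) inter-arrival times
(justified in Section 2.1.4), the mean value function is assumed continuous and
increasing so that its inverse exists, and the choice $\mu(t) = t^2$ as well as all
function names are hypothetical.

```python
import math
import random

def homogeneous_arrivals(rate, horizon, rng):
    """Arrival times in [0, horizon] of a homogeneous Poisson process with the given rate."""
    times, t = [], rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def inhomogeneous_arrivals(mu, mu_inverse, horizon, rng):
    """Arrival times in [0, horizon] of a Poisson process with mean value function mu.

    A standard homogeneous process is simulated on [0, mu(horizon)] and each of its
    arrivals s is mapped to mu_inverse(s), which is the time change t -> mu(t) read
    on the level of arrival times.
    """
    return [mu_inverse(s) for s in homogeneous_arrivals(1.0, mu(horizon), rng)]

mu = lambda t: t * t                      # hypothetical mean value function, intensity 2t
arrivals = inhomogeneous_arrivals(mu, math.sqrt, 2.0, random.Random(1))
print(arrivals)
```

With this choice of `mu` the arrivals become denser as time increases, which is
the "speeding up" of the inner clock described in Section 2.1.1.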

2.1.4 The Homogeneous Poisson Process as a Renewal Process 

In this section we study the sequence of the arrival times
$0 \le T_1 \le T_2 \le \cdots$ of a homogeneous Poisson process with intensity
$\lambda > 0$. It is our aim to find a constructive way for determining the
sequence of arrivals, which in turn can be used as an alternative definition of
the homogeneous Poisson process. This characterization is useful for studying
the path properties of the Poisson process or for simulating sample paths.

⁹ If $\lim_{t \to \infty} \mu(t) = y_0 < \infty$ for some $y_0 > 0$, $\mu^{-1}$ is
defined on $[0, y_0)$ and $\widetilde N(t) = N(\mu^{-1}(t))$ satisfies the properties
of a standard homogeneous Poisson process restricted to the interval $[0, y_0)$.
In Section 2.1.8 it is explained that such a process can be interpreted as a
Poisson process on $[0, y_0)$.

We will show that any homogeneous Poisson process with intensity $\lambda > 0$
has representation

$$N(t) = \#\{i \ge 1 : T_i \le t\}\,, \qquad t \ge 0\,, \tag{2.1.4}$$

where

$$T_n = W_1 + \cdots + W_n\,, \qquad n \ge 1\,, \tag{2.1.5}$$

and $(W_i)$ is an iid exponential $\mathrm{Exp}(\lambda)$ sequence. In what follows,
it will be convenient to write $T_0 = 0$. Since the random walk $(T_n)$ with
non-negative step sizes $W_n$ is also referred to as renewal sequence, a process
$N$ with representation (2.1.4)-(2.1.5) for a general iid sequence $(W_i)$ is called
a renewal (counting) process. We will consider general renewal processes in
Section 2.2.

Theorem 2.1.6 (The homogeneous Poisson process as a renewal process)

(1) The process $N$ given by (2.1.4) and (2.1.5) with an iid exponential
$\mathrm{Exp}(\lambda)$ sequence $(W_i)$ constitutes a homogeneous Poisson process
with intensity $\lambda > 0$.

(2) Let $N$ be a homogeneous Poisson process with intensity $\lambda$ and arrival
times $0 \le T_1 \le T_2 \le \cdots$. Then $N$ has representation (2.1.4), and $(T_i)$
has representation (2.1.5) for an iid exponential $\mathrm{Exp}(\lambda)$ sequence
$(W_i)$.
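
The Poisson marginal claimed in part (1) can be sanity-checked numerically
before reading the proof. The sketch below is an illustration, not the book's
argument: it assumes the standard finite-sum form of the Erlang distribution
function of $T_n$ and the fact that $N(t) = n$ exactly when $T_n \le t < T_{n+1}$,
both of which appear in the proof that follows; the function names and the
values $\lambda = 1.5$, $t = 2$ are hypothetical.

```python
import math

def erlang_cdf(n, lam, x):
    """P(T_n <= x) for a sum T_n of n >= 1 iid Exp(lam) variables (finite-sum formula)."""
    partial = sum((lam * x) ** k / math.factorial(k) for k in range(n))
    return 1.0 - math.exp(-lam * x) * partial

def poisson_pmf(n, mean):
    """P(M = n) for M ~ Pois(mean), mean > 0."""
    return math.exp(-mean) * mean**n / math.factorial(n)

def pmf_via_arrival_times(n, lam, t):
    """P(N(t) = n) computed as P(T_n <= t) - P(T_{n+1} <= t), with T_0 = 0."""
    left = 1.0 if n == 0 else erlang_cdf(n, lam, t)
    return left - erlang_cdf(n + 1, lam, t)

lam, t = 1.5, 2.0   # hypothetical intensity and time point
for n in range(6):
    print(n, pmf_via_arrival_times(n, lam, t), poisson_pmf(n, lam * t))
```

The two printed columns should agree up to floating-point rounding, since the
difference of consecutive Erlang distribution functions telescopes to a single
Poisson term.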

Proof. (1) We start with a renewal sequence $(T_n)$ as in (2.1.5) and set
$T_0 = 0$ for convenience. Recall the defining properties of a Poisson process
from Definition 2.1.1. The property $N(0) = 0$ a.s. follows since $W_1 > 0$ a.s. By
construction, a path $(N(t,\omega))_{t \ge 0}$ assumes the value $i$ in
$[T_i, T_{i+1})$ and jumps at $T_{i+1}$ to level $i + 1$. Hence the sample paths are
càdlàg; cf. p. 14 for a definition.

Next we verify that $N(t)$ is $\mathrm{Pois}(\lambda t)$ distributed. The crucial
relationship is given by

$$\{N(t) = n\} = \{T_n \le t < T_{n+1}\}\,, \qquad n \ge 0\,. \tag{2.1.6}$$

Since $T_n = W_1 + \cdots + W_n$ is the sum of $n$ iid $\mathrm{Exp}(\lambda)$ random
variables it is a well-known property that $T_n$ has a gamma $\Gamma(n,\lambda)$
distribution¹⁰ for $n \ge 1$:

$$P(T_n \le x) = 1 - e^{-\lambda x} \sum_{k=0}^{n-1} \frac{(\lambda x)^k}{k!}\,, \qquad x \ge 0\,.$$

Hence

$$P(N(t) = n) = P(T_n \le t) - P(T_{n+1} \le t) = e^{-\lambda t}\,\frac{(\lambda t)^n}{n!}\,.$$

This proves the Poisson property of $N(t)$.

¹⁰ You can easily verify that this is the distribution function of a
$\Gamma(n,\lambda)$ distribution by taking the first derivative. The resulting
probability density has the well-known gamma form
$\lambda\,(\lambda x)^{n-1} e^{-\lambda x}/(n-1)!$. The $\Gamma(n,\lambda)$ distribution
for $n \in \mathbb{N}$ is also known as the Erlang distribution with parameter
$(n,\lambda)$.

Now we switch to the independent stationary increment property. We use
a direct “brute force” method to prove this property. A more elegant way via
point process techniques is indicated in Resnick [65], Proposition 4.8.1. Since
the case of arbitrarily many increments becomes more involved, we focus on
the case of two increments in order to illustrate the method. The general
case is analogous but requires some bookkeeping. We focus on the adjacent
increments $N(t) = N(0,t]$ and $N(t,t+h]$ for $t, h > 0$. We have to show that
for any $k, l \in \mathbb{N}_0$,

$$\begin{aligned}
q_{k,k+l}(t,t+h) &= P(N(t) = k\,,\, N(t,t+h] = l) \\
&= P(N(t) = k)\; P(N(t,t+h] = l) \\
&= P(N(t) = k)\; P(N(h) = l) \\
&= e^{-\lambda (t+h)}\,\frac{(\lambda t)^k\,(\lambda h)^l}{k!\; l!}\,.
\end{aligned} \tag{2.1.7}$$

We start with the case $l = 0$, $k \ge 1$; the case $l = k = 0$ being trivial. We
make use of the relation

$$\{N(t) = k\,,\, N(t,t+h] = l\} = \{N(t) = k\,,\, N(t+h) = k + l\}\,. \tag{2.1.8}$$

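Then, by (2.1.6) and (2.1.8),

$$\begin{aligned}
q_{k,k+l}(t,t+h) &= P(T_k \le t < T_{k+1}\,,\, T_k \le t + h < T_{k+1}) \\
&= P(T_k \le t\,,\, t + h < T_k + W_{k+1})\,.
\end{aligned}$$

Now we can use the facts that $T_k$ is $\Gamma(k,\lambda)$ distributed with density
$\lambda^k x^{k-1} e^{-\lambda x}/(k-1)!$ and $W_{k+1}$ is $\mathrm{Exp}(\lambda)$
distributed with density $\lambda e^{-\lambda x}$:

$$\begin{aligned}
q_{k,k+l}(t,t+h) &= \int_0^t e^{-\lambda z}\,\frac{\lambda\,(\lambda z)^{k-1}}{(k-1)!} \int_{t+h-z}^{\infty} \lambda\, e^{-\lambda x}\,dx\,dz \\
&= \int_0^t e^{-\lambda z}\,\frac{\lambda\,(\lambda z)^{k-1}}{(k-1)!}\; e^{-\lambda (t+h-z)}\,dz \\
&= e^{-\lambda (t+h)}\,\frac{(\lambda t)^k}{k!}\,.
\end{aligned}$$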

For $l \ge 1$ we use another conditioning argument and (2.1.6):

$$\begin{aligned}
q_{k,k+l}(t,t+h) &= P(T_k \le t < T_{k+1}\,,\, T_{k+l} \le t + h < T_{k+l+1}) \\
&= E\big[ I_{\{T_k \le t < T_{k+1} \le t+h\}}\; P(T_{k+l} - T_{k+1} \le t + h - T_{k+1} < T_{k+l+1} - T_{k+1} \mid T_k\,, T_{k+1}) \big]\,.
\end{aligned}$$

Let $N'$ be an independent copy of $N$, i.e., $N' \stackrel{d}{=} N$. Appealing to
(2.1.6) and the independence of $T_{k+1}$ and
$(T_{k+l} - T_{k+1}\,,\, T_{k+l+1} - T_{k+1})$, we see that

$$\begin{aligned}
q_{k,k+l}(t,t+h) &= E\big[ I_{\{T_k \le t < T_{k+1} \le t+h\}}\; P(N'(t + h - T_{k+1}) = l - 1 \mid T_{k+1}) \big] \\
&= \int_0^t e^{-\lambda z}\,\frac{\lambda\,(\lambda z)^{k-1}}{(k-1)!} \int_{t-z}^{t+h-z} \lambda\, e^{-\lambda x}\; P(N(t + h - z - x) = l - 1)\,dx\,dz \\
&= \int_0^t e^{-\lambda z}\,\frac{\lambda\,(\lambda z)^{k-1}}{(k-1)!} \int_{t-z}^{t+h-z} \lambda\, e^{-\lambda x}\; e^{-\lambda (t+h-z-x)}\,\frac{(\lambda\,(t+h-z-x))^{l-1}}{(l-1)!}\,dx\,dz \\
&= e^{-\lambda (t+h)} \int_0^t \frac{\lambda\,(\lambda z)^{k-1}}{(k-1)!}\,dz \int_0^h \frac{\lambda\,(\lambda x)^{l-1}}{(l-1)!}\,dx \\
&= e^{-\lambda (t+h)}\,\frac{(\lambda t)^k}{k!}\,\frac{(\lambda h)^l}{l!}\,.
\end{aligned}$$

This is the desired relationship (2.1.7). Since

$$P(N(t,t+h] = l) = \sum_{k=0}^{\infty} P(N(t) = k\,,\, N(t,t+h] = l)\,,$$

it also follows from (2.1.7) that

$$P(N(t) = k\,,\, N(t,t+h] = l) = P(N(t) = k)\; P(N(h) = l)\,.$$

If you have enough patience, prove the analog of (2.1.7) for finitely many
increments of $N$.
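
Relation (2.1.7) can also be checked by simulation. The sketch below is only an
illustration: it draws the pair $(N(t), N(t,t+h])$ from the renewal construction
(2.1.4)-(2.1.5) and compares the empirical frequency of one pair $(k,l)$ with the
product formula; the function names and the parameter values are assumptions
made for this example.

```python
import math
import random

def increment_pair(lam, t, h, rng):
    """One draw of (N(t), N(t, t+h]) built from iid Exp(lam) inter-arrival times."""
    first = second = 0
    arrival = rng.expovariate(lam)
    while arrival <= t + h:
        if arrival <= t:
            first += 1
        else:
            second += 1
        arrival += rng.expovariate(lam)
    return first, second

def product_formula(lam, t, h, k, l):
    """Right-hand side of (2.1.7)."""
    numerator = (lam * t) ** k * (lam * h) ** l
    return math.exp(-lam * (t + h)) * numerator / (math.factorial(k) * math.factorial(l))

def empirical_joint(lam, t, h, k, l, n_draws, rng):
    """Relative frequency of the event {N(t) = k, N(t, t+h] = l} in n_draws simulations."""
    hits = sum(increment_pair(lam, t, h, rng) == (k, l) for _ in range(n_draws))
    return hits / n_draws

estimate = empirical_joint(1.0, 1.0, 2.0, 1, 2, 20000, random.Random(3))
print(estimate, product_formula(1.0, 1.0, 2.0, 1, 2))
```

The two numbers should be close but not equal, since the first is a Monte Carlo
estimate; increasing `n_draws` tightens the agreement.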

(2) Consider a homogeneous Poisson process with arrival times
$0 \le T_1 \le T_2 \le \cdots$ and intensity $\lambda > 0$. We need to show that there
exist iid exponential $\mathrm{Exp}(\lambda)$ random variables $W_i$ such that
$T_n = W_1 + \cdots + W_n$, i.e., we need to show that, for any
$0 \le x_1 \le x_2 \le \cdots \le x_n$, $n \ge 1$,

$$\begin{aligned}
&P(T_1 \le x_1\,, \ldots\,, T_n \le x_n) \\
&\quad= P(W_1 \le x_1\,, \ldots\,, W_1 + \cdots + W_n \le x_n) \\
&\quad= \int_{w_1=0}^{x_1} \lambda e^{-\lambda w_1} \int_{w_2=0}^{x_2 - w_1} \lambda e^{-\lambda w_2} \cdots \int_{w_n=0}^{x_n - w_1 - \cdots - w_{n-1}} \lambda e^{-\lambda w_n}\,dw_n \cdots dw_1\,.
\end{aligned}$$

The verification of this relation is left as an exercise. Hint: It is useful to
exploit the relationship

$$\{T_1 \le x_1\,, \ldots\,, T_n \le x_n\} = \{N(x_1) \ge 1\,, \ldots\,, N(x_n) \ge n\}$$

for $0 \le x_1 \le \cdots \le x_n$, $n \ge 1$. □

An important consequence of Theorem 2.1.6 is that the inter-arrival times

$$W_i = T_i - T_{i-1}\,, \qquad i \ge 1\,,$$

of a homogeneous Poisson process with intensity $\lambda$ are iid
$\mathrm{Exp}(\lambda)$. In particular, $T_i < T_{i+1}$ a.s. for $i \ge 1$, i.e., with
probability 1 a homogeneous Poisson process does not have jump sizes larger
than 1. Since by the strong law of large numbers
$T_n/n \stackrel{\text{a.s.}}{\to} EW_1 = \lambda^{-1} > 0$, we may also conclude that
$T_n$ grows roughly like $n/\lambda$, and therefore there are no limit points in the
sequence $(T_n)$ at any finite instant of time. This means that the values $N(t)$ of
a homogeneous Poisson process are finite on any finite time interval $[0,t]$.

The Poisson process has many amazing properties. One of them is the 
following phenomenon which runs in the literature under the name inspection 
paradox. 

Example 2.1.7 (The inspection paradox) 

Assume that you study claims which arrive in the portfolio according to a
homogeneous Poisson process $N$ with intensity $\lambda$. We have learned that the
inter-arrival times $W_n = T_n - T_{n-1}$, $n \ge 1$, with $T_0 = 0$, constitute an
iid $\mathrm{Exp}(\lambda)$ sequence. Observe the portfolio at a fixed instant of
time $t$. The last claim arrived at time $T_{N(t)}$ and the next claim will arrive
at time $T_{N(t)+1}$. Three questions arise quite naturally:

(1) What is the distribution of $B(t) = t - T_{N(t)}$, i.e., the length of the period
$(T_{N(t)}, t]$ since the last claim occurred?

(2) What is the distribution of $F(t) = T_{N(t)+1} - t$, i.e., the length of the
period $(t, T_{N(t)+1}]$ until the next claim arrives?

(3) What can be said about the joint distribution of $B(t)$ and $F(t)$?

The quantity $B(t)$ is often referred to as backward recurrence time or age,
whereas $F(t)$ is called forward recurrence time, excess life or residual life.

Intuitively, since $t$ lies somewhere between two claim arrivals and since the
inter-arrival times are iid $\mathrm{Exp}(\lambda)$, we would perhaps expect that
$P(B(t) \le x_1) < 1 - e^{-\lambda x_1}$, $x_1 < t$, and
$P(F(t) \le x_2) < 1 - e^{-\lambda x_2}$, $x_2 > 0$. However, these conjectures are
not confirmed by calculation of the joint distribution function of $B(t)$ and
$F(t)$ for $x_1, x_2 \ge 0$:

$$G_{B(t),F(t)}(x_1,x_2) = P(B(t) \le x_1\,,\, F(t) \le x_2)\,.$$

Since $B(t) \le t$ a.s. we consider the cases $x_1 < t$ and $x_1 \ge t$ separately.
We observe for $x_1 < t$ and $x_2 > 0$,

$$\begin{aligned}
\{B(t) \le x_1\} &= \{t - x_1 \le T_{N(t)} \le t\} = \{N(t - x_1, t] \ge 1\}\,, \\
\{F(t) \le x_2\} &= \{t < T_{N(t)+1} \le t + x_2\} = \{N(t, t + x_2] \ge 1\}\,.
\end{aligned}$$

Hence, by the independent stationary increments of $N$,

$$\begin{aligned}
G_{B(t),F(t)}(x_1,x_2) &= P(N(t - x_1, t] \ge 1\,,\, N(t, t + x_2] \ge 1) \\
&= P(N(t - x_1, t] \ge 1)\; P(N(t, t + x_2] \ge 1) \\
&= \big(1 - e^{-\lambda x_1}\big)\,\big(1 - e^{-\lambda x_2}\big)\,.
\end{aligned} \tag{2.1.9}$$

An analogous calculation for $x_1 \ge t$, $x_2 \ge 0$ and (2.1.9) yield

$$G_{B(t),F(t)}(x_1,x_2) = \big[ (1 - e^{-\lambda x_1})\, I_{[0,t)}(x_1) + I_{[t,\infty)}(x_1) \big]\,\big(1 - e^{-\lambda x_2}\big)\,.$$

Hence $B(t)$ and $F(t)$ are independent, $F(t)$ is $\mathrm{Exp}(\lambda)$ distributed
and $B(t)$ has a truncated exponential distribution with a jump at $t$:

$$P(B(t) \le x_1) = 1 - e^{-\lambda x_1}\,, \quad x_1 < t\,, \qquad \text{and} \qquad P(B(t) = t) = e^{-\lambda t}\,.$$

This means in particular that the forward recurrence time $F(t)$ has the same
$\mathrm{Exp}(\lambda)$ distribution as the inter-arrival times $W_i$ of the Poisson
process $N$. This property is closely related to the forgetfulness property of the
exponential distribution,

$$P(W_1 > x + y \mid W_1 > x) = P(W_1 > y)\,, \qquad x, y \ge 0\,,$$

(verify the correctness of this relation), and is also reflected in the independent
increment property of the Poisson process. It is interesting to observe that

$$\lim_{t \to \infty} P(B(t) \le x_1) = 1 - e^{-\lambda x_1}\,, \qquad x_1 > 0\,.$$

Thus, in an “asymptotic” sense, both $B(t)$ and $F(t)$ become independent and
are exponentially distributed with parameter $\lambda$.

We will return to the forward and backward recurrence times of a general
renewal process, i.e., when the $W_i$ are not necessarily iid exponential random
variables, in Example 2.2.14. □
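
The distributions found in Example 2.1.7 are easy to reproduce by simulation.
The following is a minimal sketch with hypothetical names and parameter values
($\lambda = 1$, $t = 3$): it draws $(B(t), F(t))$ from the renewal construction and
compares sample averages with the values implied by the example, namely
$EF(t) = 1/\lambda$, $EB(t) = (1 - e^{-\lambda t})/\lambda$ (obtained by integrating
the tail $P(B(t) > x) = e^{-\lambda x}$ over $[0,t)$) and $P(B(t) = t) = e^{-\lambda t}$.

```python
import math
import random

def recurrence_times(lam, t, rng):
    """One draw of (B(t), F(t)) for a homogeneous Poisson process with intensity lam."""
    last, nxt = 0.0, rng.expovariate(lam)      # T_0 = 0 and T_1
    while nxt <= t:
        last, nxt = nxt, nxt + rng.expovariate(lam)
    return t - last, nxt - t                   # (t - T_{N(t)}, T_{N(t)+1} - t)

lam, t, n_draws = 1.0, 3.0, 20000              # hypothetical parameter choices
rng = random.Random(11)
draws = [recurrence_times(lam, t, rng) for _ in range(n_draws)]

mean_b = sum(b for b, _ in draws) / n_draws
mean_f = sum(f for _, f in draws) / n_draws
atom_at_t = sum(b == t for b, _ in draws) / n_draws   # no claim in [0, t]

print(mean_b, (1 - math.exp(-lam * t)) / lam)
print(mean_f, 1 / lam)
print(atom_at_t, math.exp(-lam * t))
```

Note that the interval covering $t$ has expected length $EB(t) + EF(t)$, which
exceeds the expected inter-arrival time $1/\lambda$; this is the sense in which the
inspected interval is "longer than typical".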

2.1.5 The Distribution of the Inter-Arrival Times

By virtue of Proposition 2.1.5, an inhomogeneous Poisson process $N$ with
mean value function $\mu$ can be interpreted as a time changed standard
homogeneous Poisson process $\widetilde N$:

$$(N(t))_{t \ge 0} \stackrel{d}{=} (\widetilde N(\mu(t)))_{t \ge 0}\,.$$

In particular, let $(\widetilde T_i)$ be the arrival sequence of $\widetilde N$ and
$\mu$ be increasing and continuous. Then the inverse $\mu^{-1}$ exists and

$$N'(t) = \#\{i \ge 1 : \widetilde T_i \le \mu(t)\} = \#\{i \ge 1 : \mu^{-1}(\widetilde T_i) \le t\}\,, \qquad t \ge 0\,,$$

is a representation of $N$ in the sense of identity of the finite-dimensional
distributions, i.e., $N \stackrel{d}{=} N'$. Therefore and by virtue of Theorem 2.1.6
the arrival times of an inhomogeneous Poisson process with mean value function
$\mu$ have representation

$$T_n = \mu^{-1}(\widetilde T_n)\,, \qquad \widetilde T_n = \widetilde W_1 + \cdots + \widetilde W_n\,, \quad n \ge 1\,, \qquad \widetilde W_i \ \text{iid } \mathrm{Exp}(1)\,. \tag{2.1.10}$$

Proposition 2.1.8 (Joint distribution of arrival/inter-arrival times) 

Assume $N$ is a Poisson process on $[0,\infty)$ with a continuous a.e. positive
intensity function $\lambda$. Then the following statements hold.

(1) The vector of the arrival times $(T_1, \ldots, T_n)$ has density

$$f_{T_1,\ldots,T_n}(x_1,\ldots,x_n) = e^{-\mu(x_n)} \prod_{i=1}^{n} \lambda(x_i)\; I_{\{0 < x_1 < \cdots < x_n\}}\,. \tag{2.1.11}$$

(2) The vector of inter-arrival times
$(W_1, \ldots, W_n) = (T_1, T_2 - T_1, \ldots, T_n - T_{n-1})$ has density

$$f_{W_1,\ldots,W_n}(x_1,\ldots,x_n) = e^{-\mu(x_1 + \cdots + x_n)} \prod_{i=1}^{n} \lambda(x_1 + \cdots + x_i)\,, \qquad x_i \ge 0\,. \tag{2.1.12}$$

Proof. Since the intensity function $\lambda$ is a.e. positive and continuous,
$\mu(t) = \int_0^t \lambda(s)\,ds$ is increasing and $\mu^{-1}$ exists. Moreover,
$\mu$ is differentiable, and $\mu'(t) = \lambda(t)$. We make use of these two facts
in what follows.

(1) We start with a standard homogeneous Poisson process. Then its arrivals
$\widetilde T_n$ have representation
$\widetilde T_n = \widetilde W_1 + \cdots + \widetilde W_n$ for an iid standard
exponential sequence $(\widetilde W_i)$. The joint density of
$(\widetilde T_1, \ldots, \widetilde T_n)$ is obtained from the joint density of
$(\widetilde W_1, \ldots, \widetilde W_n)$ via the transformation:

$$\begin{aligned}
S &: (y_1, \ldots, y_n) \to (y_1\,,\, y_1 + y_2\,,\, \ldots\,,\, y_1 + \cdots + y_n)\,, \\
S^{-1} &: (z_1, \ldots, z_n) \to (z_1\,,\, z_2 - z_1\,,\, \ldots\,,\, z_n - z_{n-1})\,.
\end{aligned}$$

Note that $\det(\partial S(y)/\partial y) = 1$. Standard techniques for density
transformations (cf. Billingsley [13], p. 229) yield for $0 < x_1 < \cdots < x_n$,

$$\begin{aligned}
f_{\widetilde T_1,\ldots,\widetilde T_n}(x_1,\ldots,x_n) &= f_{\widetilde W_1,\ldots,\widetilde W_n}(x_1\,,\, x_2 - x_1\,,\, \ldots\,,\, x_n - x_{n-1}) \\
&= e^{-x_1}\, e^{-(x_2 - x_1)} \cdots e^{-(x_n - x_{n-1})} = e^{-x_n}\,.
\end{aligned}$$

Since $\mu^{-1}$ exists we conclude from (2.1.10) that for $0 < x_1 < \cdots < x_n$,

$$\begin{aligned}
P(T_1 \le x_1\,, \ldots\,, T_n \le x_n) &= P(\mu^{-1}(\widetilde T_1) \le x_1\,, \ldots\,, \mu^{-1}(\widetilde T_n) \le x_n) \\
&= P(\widetilde T_1 \le \mu(x_1)\,, \ldots\,, \widetilde T_n \le \mu(x_n)) \\
&= \int_0^{\mu(x_1)} \cdots \int_0^{\mu(x_n)} f_{\widetilde T_1,\ldots,\widetilde T_n}(y_1,\ldots,y_n)\,dy_n \cdots dy_1 \\
&= \int_0^{\mu(x_1)} \cdots \int_0^{\mu(x_n)} e^{-y_n}\, I_{\{y_1 < \cdots < y_n\}}\,dy_n \cdots dy_1\,.
\end{aligned}$$

Taking partial derivatives with respect to the variables $x_1, \ldots, x_n$ and
noticing that $\mu'(x_i) = \lambda(x_i)$, we obtain the desired density (2.1.11).

(2) Relation (2.1.12) follows by an application of the above transformations
$S$ and $S^{-1}$ from the density of $(T_1, \ldots, T_n)$:

$$f_{W_1,\ldots,W_n}(w_1,\ldots,w_n) = f_{T_1,\ldots,T_n}(w_1\,,\, w_1 + w_2\,,\, \ldots\,,\, w_1 + \cdots + w_n)\,.$$

□

From (2.1.12) we may conclude that the joint density of $W_1, \ldots, W_n$ can be
written as the product of the densities of the $W_i$’s if and only if
$\lambda(\cdot) \equiv \lambda$ for some positive constant $\lambda$. This means that
only in the case of a homogeneous Poisson process are the inter-arrival times
$W_1, \ldots, W_n$ independent (and identically distributed). This fact is another
property which distinguishes the homogeneous Poisson process within the class
of all Poisson processes on $[0,\infty)$.
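
The failure of independence is already visible for $n = 2$. The following worked
case of (2.1.12) is an illustration only; the intensity $\lambda(t) = 2t$ with
$\mu(t) = t^2$ is a hypothetical choice, and the marginal density uses
$\mu(x) \to \infty$ as $x \to \infty$.

```latex
% Worked case n = 2 of (2.1.12) for the illustrative choice \lambda(t) = 2t, \mu(t) = t^2.
\begin{align*}
f_{W_1,W_2}(x_1,x_2)
  &= e^{-\mu(x_1+x_2)}\,\lambda(x_1)\,\lambda(x_1+x_2)
   = 4\,x_1\,(x_1+x_2)\,e^{-(x_1+x_2)^2}, \qquad x_1, x_2 \ge 0, \\
f_{W_1}(x_1)
  &= \int_0^\infty f_{W_1,W_2}(x_1,x_2)\,dx_2
   = \lambda(x_1)\,e^{-\mu(x_1)}
   = 2\,x_1\,e^{-x_1^2}, \\
f_{W_2 \mid W_1}(x_2 \mid x_1)
  &= \frac{f_{W_1,W_2}(x_1,x_2)}{f_{W_1}(x_1)}
   = 2\,(x_1+x_2)\,e^{-(x_1+x_2)^2 + x_1^2}.
\end{align*}
```

The conditional density of $W_2$ depends on $x_1$, so $W_1$ and $W_2$ are
dependent. For a constant intensity $\lambda$ the same computation gives
$\lambda e^{-\lambda x_2}$, which is free of $x_1$.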



2.1.6 The Order Statistics Property 

In this section we study one of the most important properties of the Poisson 
process which in a sense characterizes the Poisson process. It is the order 
statistics property which it shares only with the mixed Poisson process to be 
considered in Section 2.3. In order to formulate this property we first give a 
well-known result on the distribution of the order statistics 

$$X_{(1)} \le \cdots \le X_{(n)}$$

of an iid sample $X_1, \ldots, X_n$.

Lemma 2.1.9 (Joint density of order statistics)

If the iid $X_i$'s have density $f$ then the density of the vector $(X_{(1)}, \ldots, X_{(n)})$
is given by

$$f_{X_{(1)},\ldots,X_{(n)}}(x_1,\ldots,x_n) = n! \prod_{i=1}^{n} f(x_i)\; I_{\{x_1<\cdots<x_n\}} .$$

Remark 2.1.10 By construction of the order statistics, the support of the
vector $(X_{(1)}, \ldots, X_{(n)})$ is the set

$$C_n = \{(x_1,\ldots,x_n) : x_1 \le \cdots \le x_n\} \subset \mathbb{R}^n ,$$

and therefore the density $f_{X_{(1)},\ldots,X_{(n)}}$ vanishes outside $C_n$. Since the existence
of a density of $X_i$ implies that all elements of the iid sample $X_1, \ldots, X_n$ are
different a.s., the $\le$'s in the definition of $C_n$ could be replaced by $<$'s. □
Proof. We start by recalling that the iid sample $X_1, \ldots, X_n$ with common
density $f$ has no ties. This means that the event

$$\widetilde\Omega = \{X_{(1)} < \cdots < X_{(n)}\} = \{X_i \ne X_j \ \text{for } 1 \le i < j \le n\}$$

has probability 1. It is an immediate consequence of the fact that for $i \ne j$,

$$P(X_i = X_j) = E[P(X_i = X_j \mid X_j)] = \int_{\mathbb{R}} P(X_i = y)\, f(y)\, dy = 0 ,$$

since $P(X_i = y) = \int_{\{y\}} f(z)\, dz = 0$. Then

$$1 - P(\widetilde\Omega) = P\Big(\bigcup_{1\le i<j\le n} \{X_i = X_j\}\Big) \le \sum_{1\le i<j\le n} P(X_i = X_j) = 0 .$$

Now we turn to the proof of the statement of the lemma. Let $\Pi_n$ be the set
of the permutations $\pi$ of $\{1, \ldots, n\}$. Fix the values $x_1 < \cdots < x_n$. Then

$$P(X_{(1)} \le x_1, \ldots, X_{(n)} \le x_n) = P\Big(\bigcup_{\pi\in\Pi_n} A_\pi\Big) , \tag{2.1.13}$$

where

$$A_\pi = \{X_{\pi(i)} = X_{(i)},\ i = 1,\ldots,n\} \cap \{X_{\pi(1)} \le x_1, \ldots, X_{\pi(n)} \le x_n\} .$$

The identity (2.1.13) means that the ordered sample $X_{(1)} < \cdots < X_{(n)}$ could
have come from any of the ordered values $X_{\pi(1)} < \cdots < X_{\pi(n)}$, $\pi \in \Pi_n$, where
we also make use of the fact that there are no ties in the sample. Since the
$A_\pi$'s are disjoint,

$$P\Big(\bigcup_{\pi\in\Pi_n} A_\pi\Big) = \sum_{\pi\in\Pi_n} P(A_\pi) .$$

Moreover, since the $X_i$'s are iid,

$$\begin{aligned}
P(A_\pi) &= P\big((X_{\pi(1)},\ldots,X_{\pi(n)}) \in C_n \cap (-\infty,x_1]\times\cdots\times(-\infty,x_n]\big) \\
&= P\big((X_1,\ldots,X_n) \in C_n \cap (-\infty,x_1]\times\cdots\times(-\infty,x_n]\big) \\
&= \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} \prod_{i=1}^{n} f(y_i)\; I_{\{y_1<\cdots<y_n\}}\, dy_n \cdots dy_1 .
\end{aligned}$$

Therefore and since there are $n!$ elements in $\Pi_n$,

$$P(X_{(1)} \le x_1, \ldots, X_{(n)} \le x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} n! \prod_{i=1}^{n} f(y_i)\; I_{\{y_1<\cdots<y_n\}}\, dy_n \cdots dy_1 . \tag{2.1.14}$$

By Remark 2.1.10 about the support of $(X_{(1)}, \ldots, X_{(n)})$ and by virtue of the
Radon-Nikodym theorem, we can read off the density of $(X_{(1)}, \ldots, X_{(n)})$ as
the integrand in (2.1.14). Indeed, the Radon-Nikodym theorem ensures that
the integrand is the a.e. unique probability density of $(X_{(1)}, \ldots, X_{(n)})$.$^{11}$ □

We are now ready to formulate one of the main results of this course.

Theorem 2.1.11 (Order statistics property of the Poisson process)

Consider the Poisson process $N = (N(t))_{t \ge 0}$ with continuous a.e. positive
intensity function $\lambda$ and arrival times $0 < T_1 < T_2 < \cdots$ a.s. Then the
conditional distribution of $(T_1, \ldots, T_n)$ given $\{N(t) = n\}$ is the distribution of
the ordered sample $(X_{(1)}, \ldots, X_{(n)})$ of an iid sample $X_1, \ldots, X_n$ with common
density $\lambda(x)/\mu(t)$, $0 < x \le t$:

$$(T_1, \ldots, T_n \mid N(t) = n) \stackrel{d}{=} (X_{(1)}, \ldots, X_{(n)}) .$$

In other words, the left-hand vector has conditional density

$$f_{T_1,\ldots,T_n}(x_1,\ldots,x_n \mid N(t) = n) = \frac{n!}{(\mu(t))^n} \prod_{i=1}^{n} \lambda(x_i) , \qquad 0 < x_1 < \cdots < x_n < t . \tag{2.1.15}$$

Proof. We show that the limit

$$\lim_{h_i \downarrow 0,\; i=1,\ldots,n} \frac{P(T_1 \in (x_1, x_1+h_1], \ldots, T_n \in (x_n, x_n+h_n] \mid N(t) = n)}{h_1 \cdots h_n} \tag{2.1.16}$$

exists and is a continuous function of the $x_i$'s. A similar argument (which
we omit) proves the analogous statement for the intervals $(x_i - h_i, x_i]$ with
the same limit function. The limit can be interpreted as a density for the
conditional probability distribution of $(T_1, \ldots, T_n)$, given $\{N(t) = n\}$.

Since $0 < x_1 < \cdots < x_n < t$ we can choose the $h_i$'s so small that the
intervals $(x_i, x_i + h_i] \subset [0, t]$, $i = 1, \ldots, n$, become disjoint. Then the following
identity is immediate:

$$\begin{aligned}
&\{T_1 \in (x_1, x_1+h_1], \ldots, T_n \in (x_n, x_n+h_n],\ N(t) = n\} \\
&\quad = \{N(0,x_1] = 0,\ N(x_1,x_1+h_1] = 1,\ N(x_1+h_1,x_2] = 0, \\
&\qquad\ N(x_2,x_2+h_2] = 1, \ldots, N(x_{n-1}+h_{n-1},x_n] = 0, \\
&\qquad\ N(x_n,x_n+h_n] = 1,\ N(x_n+h_n,t] = 0\} .
\end{aligned}$$

11 Relation (2.1.14) means that for all rectangles $R = (-\infty, x_1] \times \cdots \times (-\infty, x_n]$ with
$0 \le x_1 < \cdots < x_n$ and for $\mathbf{X}_n = (X_{(1)}, \ldots, X_{(n)})$, $P(\mathbf{X}_n \in R) = \int_R f_{\mathbf{X}_n}(\mathbf{x})\, d\mathbf{x}$.
By the particular form of the support of $\mathbf{X}_n$, the latter relation remains valid for
any rectangles in $\mathbb{R}^n$. An extension argument (cf. Billingsley [13]) ensures that
the distribution of $\mathbf{X}_n$ is absolutely continuous with respect to Lebesgue measure
with a density which coincides with $f_{\mathbf{X}_n}$ on the rectangles. The Radon-Nikodym
theorem ensures the a.e. uniqueness of $f_{\mathbf{X}_n}$.
Taking probabilities on both sides and exploiting the independent increments
of the Poisson process $N$, we obtain

$$\begin{aligned}
&P(T_1 \in (x_1,x_1+h_1], \ldots, T_n \in (x_n,x_n+h_n],\ N(t) = n) \\
&\quad = P(N(0,x_1] = 0)\, P(N(x_1,x_1+h_1] = 1)\, P(N(x_1+h_1,x_2] = 0) \\
&\qquad\ P(N(x_2,x_2+h_2] = 1) \cdots P(N(x_{n-1}+h_{n-1},x_n] = 0) \\
&\qquad\ P(N(x_n,x_n+h_n] = 1)\, P(N(x_n+h_n,t] = 0) \\
&\quad = e^{-\mu(x_1)} \big[\mu(x_1,x_1+h_1]\, e^{-\mu(x_1,x_1+h_1]}\big]\, e^{-\mu(x_1+h_1,x_2]} \\
&\qquad\ \big[\mu(x_2,x_2+h_2]\, e^{-\mu(x_2,x_2+h_2]}\big] \cdots e^{-\mu(x_{n-1}+h_{n-1},x_n]} \\
&\qquad\ \big[\mu(x_n,x_n+h_n]\, e^{-\mu(x_n,x_n+h_n]}\big]\, e^{-\mu(x_n+h_n,t]} \\
&\quad = e^{-\mu(t)}\, \mu(x_1,x_1+h_1] \cdots \mu(x_n,x_n+h_n] .
\end{aligned}$$

Figure 2.1.12 Five realizations of the arrival times $T_i$ of a standard homogeneous
Poisson process conditioned to have 20 arrivals in $[0,1]$. The arrivals in each row
can be interpreted as the ordered sample of an iid $U(0,1)$ sequence.

Dividing by $P(N(t) = n) = e^{-\mu(t)} (\mu(t))^n / n!$ and $h_1 \cdots h_n$, we obtain the
scaled conditional probability

$$\begin{aligned}
\frac{P(T_1 \in (x_1,x_1+h_1], \ldots, T_n \in (x_n,x_n+h_n] \mid N(t) = n)}{h_1 \cdots h_n}
&= \frac{n!}{(\mu(t))^n}\, \frac{\mu(x_1,x_1+h_1]}{h_1} \cdots \frac{\mu(x_n,x_n+h_n]}{h_n} \\
&\to \frac{n!}{(\mu(t))^n}\, \lambda(x_1) \cdots \lambda(x_n) ,
\end{aligned}$$

as $h_i \downarrow 0$, $i = 1, \ldots, n$.

Keeping in mind (2.1.16), this is the desired relation (2.1.15). In the last step
we used the continuity of $\lambda$ to show that $\mu'(x_i) = \lambda(x_i)$. □
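Theorem 2.1.11 suggests a simple way to simulate the arrival times on $[0,t]$ given $N(t) = n$: draw $n$ iid points with density $\lambda(x)/\mu(t)$ and sort them. The following Python sketch illustrates this under an assumption that is not taken from the text: the intensity is $\lambda(x) = 2x$, so that $\mu(t) = t^2$ and the inverse transform is $X = t\sqrt{U}$ for $U \sim U(0,1)$. The function name is hypothetical.

```python
import math
import random

def conditional_arrivals(n, t, rng):
    """Arrival times T_1 <= ... <= T_n on [0, t], given N(t) = n.

    Illustrative assumption: intensity lambda(x) = 2x, hence mu(t) = t**2.
    The common density lambda(x)/mu(t) = 2x/t**2 has distribution function
    (x/t)**2, whose inverse is u -> t*sqrt(u).
    """
    # iid sample with density lambda(x)/mu(t), drawn by inverse transform
    sample = [t * math.sqrt(rng.random()) for _ in range(n)]
    # by Theorem 2.1.11 the ordered sample has the conditional law of (T_1,...,T_n)
    return sorted(sample)

rng = random.Random(1)
arrivals = conditional_arrivals(5, 2.0, rng)
```

For another intensity only the inverse of $x \mapsto \mu(x)/\mu(t)$ has to be replaced.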

Example 2.1.13 (Order statistics property of the homogeneous Poisson process)

Consider a homogeneous Poisson process with intensity $\lambda > 0$. Then Theorem 2.1.11
yields the joint conditional density of the arrival times $T_i$:

$$f_{T_1,\ldots,T_n}(x_1,\ldots,x_n \mid N(t) = n) = n!\, t^{-n} , \qquad 0 < x_1 < \cdots < x_n < t .$$

A glance at Lemma 2.1.9 convinces one that this is the joint density of a
uniform ordered sample $U_{(1)} < \cdots < U_{(n)}$ of iid $U(0,t)$ distributed $U_1, \ldots, U_n$.
Thus, given there are $n$ arrivals of a homogeneous Poisson process in the
interval $[0,t]$, these arrivals constitute the points of a uniform ordered sample
in $(0,t)$. In particular, this property is independent of the intensity $\lambda$! □
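The statement of Example 2.1.13 can be checked numerically. The sketch below, with hypothetical function names and arbitrarily chosen parameters, generates the conditional arrival times in two ways: directly as a sorted $U(0,t)$ sample, and by simulating iid exponential inter-arrival times and keeping only those paths with exactly $n$ arrivals in $[0,t]$. Both should show the same behavior, for instance $E(T_1 \mid N(t) = n) = t/(n+1)$.

```python
import random

def uniform_order_statistics(n, t, rng):
    """Ordered sample of n iid U(0, t) points."""
    return sorted(rng.uniform(0.0, t) for _ in range(n))

def arrivals_by_rejection(lam, n, t, rng):
    """Arrival times of a homogeneous Poisson process with intensity lam,
    built from iid Exp(lam) inter-arrival times and accepted only if
    exactly n arrivals fall into [0, t]."""
    while True:
        arrivals, clock = [], rng.expovariate(lam)
        while clock <= t:
            arrivals.append(clock)
            clock += rng.expovariate(lam)
        if len(arrivals) == n:
            return arrivals

rng = random.Random(0)
direct = uniform_order_statistics(3, 1.0, rng)
rejected = arrivals_by_rejection(2.0, 3, 1.0, rng)
```

The rejection step is only practical when $P(N(t) = n)$ is not too small; the sorted uniform sample works for any $n$ and does not involve $\lambda$ at all.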

Example 2.1.14 (Symmetric function)

We consider a symmetric measurable function $g$ on $\mathbb{R}^n$, i.e., for any permutation
$\pi$ of $\{1, \ldots, n\}$ we have

$$g(x_1, \ldots, x_n) = g(x_{\pi(1)}, \ldots, x_{\pi(n)}) .$$

Such functions include products and sums:

$$g_s(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i , \qquad g_p(x_1, \ldots, x_n) = \prod_{i=1}^{n} x_i .$$

Under the conditions of Theorem 2.1.11 and with the same notation, we conclude that

$$(g(T_1, \ldots, T_n) \mid N(t) = n) \stackrel{d}{=} g(X_{(1)}, \ldots, X_{(n)}) = g(X_1, \ldots, X_n) .$$

For example, for any measurable function $f$ on $\mathbb{R}$,

$$\Big(\sum_{i=1}^{n} f(T_i) \,\Big|\, N(t) = n\Big) \stackrel{d}{=} \sum_{i=1}^{n} f(X_{(i)}) = \sum_{i=1}^{n} f(X_i) .$$

□







Example 2.1.15 (Shot noise)

This kind of stochastic process was used early on to model an electric current.
Electrons arrive according to a homogeneous Poisson process $N$ with rate
$\lambda$ at times $T_i$. An arriving electron produces an electric current whose time
evolution of discharge is described as a deterministic function $f$ with $f(t) = 0$
for $t < 0$. Shot noise describes the electric current at time $t$ produced by all
electrons arrived by time $t$ as a superposition:

$$S(t) = \sum_{i=1}^{N(t)} f(t - T_i) .$$

Typical choices for $f$ are exponential functions $f(t) = e^{-\theta t} I_{[0,\infty)}(t)$, $\theta > 0$.
An extension of classical shot noise processes with various applications is the
process

$$S(t) = \sum_{i=1}^{N(t)} X_i\, f(t - T_i) , \qquad t \ge 0 , \tag{2.1.17}$$

where

• $(X_i)$ is an iid sequence, independent of $(T_i)$.

• $f$ is a deterministic function with $f(t) = 0$ for $t < 0$.

For example, if we assume that the $X_i$'s are positive random variables, $S(t)$ is
a generalization of the Cramér-Lundberg model, see Example 2.1.3. Indeed,
choose $f = I_{[0,\infty)}$; then the shot noise process (2.1.17) is the total claim
amount in the Cramér-Lundberg model. In an insurance context, $f$ can also
describe delay in claim settlement or some discount factor.

Delay in claim settlement is for example described by a function $f$ satisfying

• $f(t) = 0$ for $t < 0$,

• $f(t)$ is non-decreasing,

• $\lim_{t\to\infty} f(t) = 1$.

In contrast to the Cramér-Lundberg model, where the claim size $X_i$ is paid off
at the time $T_i$ when it occurs, a more general payoff function $f(t)$ allows one
to delay the payment, and the speed at which this happens depends on the
growth of the function $f$. Delay in claim settlement is advantageous from the
point of view of the insurer. In the meantime the amount of money which was
not paid for covering the claim could be invested and would perhaps bring
some extra gain.

Suppose the amount $Y_i$ is invested at time $T_i$ in a riskless asset (savings
account) with constant interest rate $r > 0$, $(Y_i)$ is an iid sequence of positive
random variables and the sequences $(Y_i)$ and $(T_i)$ are independent. Continuous
compounding yields the amount $\exp\{r(t - T_i)\}\, Y_i$ at time $t > T_i$. For
iid amounts $Y_i$ which are invested at the arrival times $T_i$ of a homogeneous
Poisson process, the total value of all investments at time $t$ is given by

$$S_1(t) = \sum_{i=1}^{N(t)} e^{r(t - T_i)}\, Y_i , \qquad t \ge 0 .$$

This is another shot noise process.

Alternatively, one may be interested in the present value of payments $Y_i$
made at times $T_i$ in the future. Then the present value with respect to the
time frame $[0,t]$ is given as the discounted sum

$$S_2(t) = \sum_{i=1}^{N(t)} e^{-r(t - T_i)}\, Y_i , \qquad t \ge 0 .$$

A visualization of the sample paths of the processes $S_1$ and $S_2$ can be found
in Figure 2.1.17. □
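The processes of Example 2.1.15 differ only in the choice of the function $f$ in (2.1.17). A minimal Python sketch of this idea follows; the function names, the interest rate $r = 0.05$, the intensity 0.1 and the standard exponential amounts are illustrative assumptions.

```python
import math
import random

def poisson_arrivals(lam, t, rng):
    """Arrival times in [0, t] of a homogeneous Poisson process with rate lam."""
    arrivals, clock = [], rng.expovariate(lam)
    while clock <= t:
        arrivals.append(clock)
        clock += rng.expovariate(lam)
    return arrivals

def shot_noise(t, arrivals, marks, f):
    """S(t) = sum of X_i * f(t - T_i) over all arrivals T_i <= t, cf. (2.1.17)."""
    return sum(x * f(t - s) for s, x in zip(arrivals, marks) if s <= t)

R = 0.05  # hypothetical interest rate

def compound(u):
    """f for S_1: continuous compounding."""
    return math.exp(R * u) if u >= 0 else 0.0

def discount(u):
    """f for S_2: continuous discounting."""
    return math.exp(-R * u) if u >= 0 else 0.0

def indicator(u):
    """f = I_[0, infinity): total claim amount of the Cramer-Lundberg model."""
    return 1.0 if u >= 0 else 0.0

rng = random.Random(42)
T = poisson_arrivals(0.1, 100.0, rng)
Y = [rng.expovariate(1.0) for _ in T]
s1 = shot_noise(100.0, T, Y, compound)
s2 = shot_noise(100.0, T, Y, discount)
total = shot_noise(100.0, T, Y, indicator)
```

Since the amounts are positive and $e^{-ru} \le 1 \le e^{ru}$ for $u \ge 0$, the three values are ordered as $S_2(t) \le$ total claim amount $\le S_1(t)$ on every path.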

The distributional properties of a shot noise process can be treated in the 
framework of the following general result. 

Proposition 2.1.16 Let $(X_i)$ be an iid sequence, independent of the sequence
$(T_i)$ of arrival times of a homogeneous Poisson process $N$ with intensity $\lambda$.
Then for any measurable function $g : \mathbb{R}^2 \to \mathbb{R}$ the following identity in distribution
holds:

$$S(t) = \sum_{i=1}^{N(t)} g(T_i, X_i) \stackrel{d}{=} \sum_{i=1}^{N(t)} g(t\,U_i, X_i) ,$$

where $(U_i)$ is an iid $U(0,1)$ sequence, independent of $(X_i)$ and $(T_i)$.

Proof. A conditioning argument together with the order statistics property
of Theorem 2.1.11 yields that for $x \in \mathbb{R}$,

$$P\Big(\sum_{i=1}^{N(t)} g(T_i, X_i) \le x \,\Big|\, N(t) = n\Big) = P\Big(\sum_{i=1}^{n} g(t\,U_{(i)}, X_i) \le x\Big) ,$$

where $U_1, \ldots, U_n$ is an iid $U(0,1)$ sample, independent of $(X_i)$ and $(T_i)$, and
$U_{(1)}, \ldots, U_{(n)}$ is the corresponding ordered sample. By the iid property of $(X_i)$
and its independence of $(U_i)$, we can permute the order of the $X_i$'s arbitrarily
without changing the distribution of $\sum_{i=1}^{n} g(t\,U_{(i)}, X_i)$:

$$\begin{aligned}
P\Big(\sum_{i=1}^{n} g(t\,U_{(i)}, X_i) \le x\Big)
&= E\Big[P\Big(\sum_{i=1}^{n} g(t\,U_{(i)}, X_i) \le x \,\Big|\, U_1, \ldots, U_n\Big)\Big] \\
&= E\Big[P\Big(\sum_{i=1}^{n} g(t\,U_{(i)}, X_{\pi(i)}) \le x \,\Big|\, U_1, \ldots, U_n\Big)\Big] ,
\end{aligned} \tag{2.1.18}$$

where $\pi$ is any permutation of $\{1, \ldots, n\}$. In particular, we can choose $\pi$ such
that for given $U_1, \ldots, U_n$, $U_{(i)} = U_{\pi(i)}$, $i = 1, \ldots, n$.$^{12}$ Then (2.1.18) turns
into

$$P\Big(\sum_{i=1}^{n} g(t\,U_{(i)}, X_i) \le x\Big) = E\Big[P\Big(\sum_{i=1}^{n} g(t\,U_{\pi(i)}, X_{\pi(i)}) \le x \,\Big|\, U_1, \ldots, U_n\Big)\Big] = P\Big(\sum_{i=1}^{n} g(t\,U_i, X_i) \le x\Big) .$$

This proves the proposition. □

Figure 2.1.17 Visualization of the paths of a shot noise process. Top: 80 paths
of the processes $Y_i\, e^{r(t - T_i)}$, $t \ge T_i$, where $(T_i)$ are the points of a Poisson process
with intensity 0.1, $(Y_i)$ are iid standard exponential, $r = -0.01$ (left) and $r = 0.001$
(right). Bottom: The corresponding paths of the shot noise process
$S(t) = \sum_{T_i \le t} Y_i\, e^{r(t - T_i)}$ presented as a superposition of the paths in the corresponding top
graphs. The graphs show nicely how the interest rate $r$ influences the aggregated value
of future claims or payments $Y_i$. We refer to Example 2.1.15 for a more detailed
description of these processes.

12 We give an argument to make this step in the proof more transparent. Since $(U_i)$
and $(X_i)$ are independent, it is possible to define $((U_i), (X_i))$ on the product space
$\Omega_1 \times \Omega_2$ equipped with suitable $\sigma$-fields and probability measures, and such that
$(U_i)$ lives on $\Omega_1$ and $(X_i)$ on $\Omega_2$. While conditioning on $u_1 = U_1(\omega_1), \ldots, u_n = U_n(\omega_1)$,
$\omega_1 \in \Omega_1$, choose the permutation $\pi = \pi(\omega_1)$ of $\{1, \ldots, n\}$ with
$u_{\pi(1,\omega_1)} \le \cdots \le u_{\pi(n,\omega_1)}$, and then with probability 1,

$$P(\{\omega_2 : (X_1(\omega_2), \ldots, X_n(\omega_2)) \in A\}) = P(\{\omega_2 : (X_{\pi(1,\omega_1)}(\omega_2), \ldots, X_{\pi(n,\omega_1)}(\omega_2)) \in A\} \mid U_1(\omega_1) = u_1, \ldots, U_n(\omega_1) = u_n) .$$
It is clear that Proposition 2.1.16 can be extended to the case when $(T_i)$ is the
arrival sequence of an inhomogeneous Poisson process. The interested reader
is encouraged to go through the steps of the proof in this more general case.

Proposition 2.1.16 has a multitude of applications. We give one of them 
and consider more in the exercises. 

Example 2.1.18 (Continuation of the shot noise Example 2.1.15)

In Example 2.1.15 we considered the stochastically discounted random sums

$$S(t) = \sum_{i=1}^{N(t)} e^{-r(t - T_i)}\, X_i . \tag{2.1.19}$$

According to Proposition 2.1.16, we have

$$S(t) \stackrel{d}{=} \sum_{i=1}^{N(t)} e^{-r(t - t U_i)}\, X_i \stackrel{d}{=} \sum_{i=1}^{N(t)} e^{-r t U_i}\, X_i , \tag{2.1.20}$$

where $(X_i)$, $(U_i)$ and $N$ are mutually independent. Here we also used the
fact that $(1 - U_i)$ and $(U_i)$ have the same distribution. The structure of the
random sum (2.1.19) is more complicated than the structure of the right-hand
expression in (2.1.20) since in the latter sum the summands are independent
of $N(t)$ and iid. For example, it is an easy matter to calculate the mean and
variance of the expression on the right-hand side of (2.1.20) whereas it is a
rather tedious procedure if one starts with (2.1.19). For example, we calculate

$$\begin{aligned}
ES(t) &= E\Big[E\Big(\sum_{i=1}^{N(t)} e^{-r t U_i}\, X_i \,\Big|\, N(t)\Big)\Big] = E\big[N(t)\, E\big(e^{-r t U_1} X_1\big)\big] \\
&= EN(t)\; E e^{-r t U_1}\; EX_1 = \lambda\, r^{-1} (1 - e^{-rt})\, EX_1 .
\end{aligned}$$

Compare with the expectation in the Cramér-Lundberg model ($r = 0$):
$ES(t) = \lambda\, t\, EX_1$. □
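The mean formula of Example 2.1.18 can be checked by simulation. The sketch below draws the discounted sum both in the form (2.1.19), using the arrival times, and in the form on the right-hand side of (2.1.20), using iid uniforms, and compares the sample means with $\lambda r^{-1}(1 - e^{-rt})\, EX_1$. The parameter values, the standard exponential claim sizes and all function names are assumptions chosen for illustration.

```python
import math
import random

def poisson_arrivals(lam, t, rng):
    """Arrival times in [0, t] of a homogeneous Poisson process with rate lam."""
    arrivals, clock = [], rng.expovariate(lam)
    while clock <= t:
        arrivals.append(clock)
        clock += rng.expovariate(lam)
    return arrivals

def discounted_sum_direct(lam, r, t, rng):
    """One draw of (2.1.19) with Exp(1) claim sizes."""
    return sum(math.exp(-r * (t - s)) * rng.expovariate(1.0)
               for s in poisson_arrivals(lam, t, rng))

def discounted_sum_uniform(lam, r, t, rng):
    """One draw of the right-hand side of (2.1.20) with Exp(1) claim sizes."""
    n = len(poisson_arrivals(lam, t, rng))  # N(t) is Pois(lam * t) distributed
    return sum(math.exp(-r * t * rng.random()) * rng.expovariate(1.0)
               for _ in range(n))

def theoretical_mean(lam, r, t, mean_claim):
    """E S(t) = lam * (1 - exp(-r t)) / r * E X_1."""
    return lam * (1.0 - math.exp(-r * t)) / r * mean_claim

def monte_carlo_mean(draw, reps, *args):
    """Sample mean of reps independent draws of draw(*args)."""
    return sum(draw(*args) for _ in range(reps)) / reps

rng = random.Random(2024)
lam, r, t = 2.0, 0.1, 5.0
exact = theoretical_mean(lam, r, t, 1.0)
est_direct = monte_carlo_mean(discounted_sum_direct, 20000, lam, r, t, rng)
est_uniform = monte_carlo_mean(discounted_sum_uniform, 20000, lam, r, t, rng)
```

Both estimates should lie close to `exact`; letting `r` tend to 0 in `theoretical_mean` recovers the value $\lambda t\, EX_1$ of the Cramér-Lundberg model.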



Comments 

The order statistics property of a Poisson process can be generalized to Poisson 
processes with points in abstract spaces. We give an informal discussion of 
these processes in Section 2.1.8. In Exercise 20 on p. 58 we indicate how the
“order statistics property” can be implemented, for example, in a Poisson
process with points in the unit cube of $\mathbb{R}^d$.







2.1.7 A Discussion of the Arrival Times of the Danish Fire 
Insurance Data 1980-1990 

In this section we want to illustrate the theoretical results of the Poisson 
process by means of the arrival process of a real-life data set: the Danish 
fire insurance data in the period from January 1, 1980, until December 31, 
1990. The data were communicated to us by Mette Rytgaard and are available 
under www.math.ethz.ch/~mcneil. There is a total of n = 2 167 observations. 
Here we focus on the arrival process. In Section 3.2, and in particular in 
Example 3.2.11, we study the corresponding claim sizes. 

The arrival and the corresponding inter-arrival times are plotted in Figure 2.1.19.
Together with the arrival times we show the straight line $f(t) = 1.85\, t$.
The value $\widehat\lambda = n/T_n = 1/1.85$ is the maximum likelihood estimator of
$\lambda$ under the hypothesis that the inter-arrival times $W_i$ are iid $\mathrm{Exp}(\lambda)$.
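The estimator $\widehat\lambda = n/T_n$ is simply the reciprocal of the sample mean of the inter-arrival times. A short Python illustration follows; the arrival times in it are made-up toy values, not the Danish fire insurance data, and the function name is hypothetical.

```python
def exponential_mle(arrival_times):
    """Maximum likelihood estimator n / T_n of the intensity, assuming
    iid Exp(lambda) inter-arrival times and arrivals 0 < T_1 <= ... <= T_n."""
    n = len(arrival_times)
    return n / arrival_times[-1]

# toy arrival times (in days), invented for illustration
toy_arrivals = [2, 3, 7, 8, 10]
inter_arrivals = [b - a for a, b in zip([0] + toy_arrivals[:-1], toy_arrivals)]
lam_hat = exponential_mle(toy_arrivals)
mean_inter_arrival = sum(inter_arrivals) / len(inter_arrivals)
```

Here `lam_hat` equals `1 / mean_inter_arrival`, because the inter-arrival times sum to $T_n$.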





Figure 2.1.19 Left: The arrival times of the Danish fire insurance data 1980-1990.
The solid straight line has slope 1.85 which is estimated as the overall sample mean
of the inter-arrival times. Since the graph of $(T_n)$ lies above the straight line an
inhomogeneous Poisson process is more appropriate for modeling the claim number
in this portfolio. Right: The corresponding inter-arrival times. There is a total of
n = 2 167 observations.



In Table 2.1.20 we summarize some basic statistics of the inter-arrival
times for each year and for the whole period. Since the reciprocal of the
annual sample mean is an estimator of the intensity, the table gives one the
impression that there is a tendency for increasing intensity when time goes by.
This phenomenon is supported by the left graph in Figure 2.1.21 where the
annual mean inter-arrival times are visualized together with moving average
estimates of the intensity function $\lambda(t)$. The estimate of the mean inter-arrival
time at $t = i$ is defined as the moving average$^{13}$

$$(\widehat\lambda(i))^{-1} = (2m+1)^{-1} \sum_{j=\max(1,\,i-m)}^{\min(n,\,i+m)} W_j \qquad \text{for } m = 50 . \tag{2.1.21}$$

The corresponding estimates for $\lambda(i)$ can be interpreted as estimates of the
intensity function. There is a clear tendency for the intensity to increase over
the last years. This tendency can also be seen in the right graph of Figure 2.1.21.
Indeed, the boxplots$^{14}$ of this figure indicate that the distribution
of the inter-arrival times of the claims is less spread towards the end of the
1980s and concentrated around the value 1 in contrast to 2 at the beginning
of the 1980s. Moreover, the annual claim number increases.
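The moving average (2.1.21) is easy to compute. The following Python sketch implements the formula exactly as stated, with normalization $2m+1$ for every $i$; as a consequence, values within $m$ indices of either end of the sample are biased downward because fewer than $2m+1$ terms enter the sum. The function name and the toy input are assumptions.

```python
def moving_average_inter_arrival(w, m):
    """Moving average estimates (lambda_hat(i))^{-1}, i = 1, ..., n, as in (2.1.21).

    w is the list of inter-arrival times W_1, ..., W_n (stored 0-based);
    the window for index i is max(1, i - m), ..., min(n, i + m).
    """
    n = len(w)
    estimates = []
    for i in range(1, n + 1):  # 1-based index as in (2.1.21)
        lo, hi = max(1, i - m), min(n, i + m)
        estimates.append(sum(w[lo - 1:hi]) / (2 * m + 1))
    return estimates

# toy example: constant inter-arrival times, window half-width m = 1
toy = moving_average_inter_arrival([1.0] * 10, 1)
```

For the constant toy input the interior estimates equal 1, while the two boundary estimates equal 2/3, which illustrates the boundary effect.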
Table 2.1.20 Basic statistics for the Danish fire inter-arrival times data.

13 Moving average estimates such as (2.1.21) are proposed in time series analysis in
order to estimate a deterministic trend which perturbs a stationary time series.
We refer to Brockwell and Davis [16] and Priestley [63] for some theory and
properties of the estimator $(\widehat\lambda(i))^{-1}$ and related estimates. More sophisticated
estimators can be obtained by using kernel curve estimators in the regression
model $W_i = (\lambda(i))^{-1} + \varepsilon_i$ for some smooth deterministic function $\lambda$ and iid or
weakly dependent stationary noise $(\varepsilon_i)$. We refer to Fan and Gijbels [31] and
Gasser et al. [33] for some standard theory of kernel curve estimation; see also
Müller and Stadtmüller [59].

14 The boxplot of a data set is a means to visualize the empirical distribution of
the data. The middle part of the plot (box) indicates the median $x_{0.50}$, the 25%
and 75% quantiles ($x_{0.25}$ and $x_{0.75}$) of the data. The “whiskers” of the data are
the lines $x_{0.50} \pm 1.5\,(x_{0.75} - x_{0.25})$. Values outside the whiskers (“outliers”) are
plotted as points.

Figure 2.1.21 Left, upper graph: The piecewise constant function represents
the annual expected inter-arrival time between 1980 and 1990. The length of each
constant piece is the claim number in the corresponding year. The annual estimates
are supplemented by a moving average estimate $(\widehat\lambda(i))^{-1}$ defined in (2.1.21).
Left, lower graph: The reciprocals of the values of the upper graph which can be
interpreted as estimates of the Poisson intensity. There is a clear tendency for the
intensity to increase over the last years. Right: Boxplots for the annual samples of
the inter-arrival times (No 1-11) and the sample over 11 years (No 12).

Since we have gained statistical evidence that the intensity function of
the Danish fire insurance data is not constant over 11 years, we assume in
Figure 2.1.22 that the arrivals are modeled by an inhomogeneous Poisson
process with continuous mean value function. We assume that the intensity is
constant for every year, but it may change from year to year. Hence the mean
value function $\mu(t)$ of the Poisson process is piecewise linear with possibly
different slopes in different years; see the top left graph in Figure 2.1.22. We
choose the estimated intensities presented in Table 2.1.20 and in the left graph
of Figure 2.1.21. We transform the arrivals $T_n$ into $\mu(T_n)$. According to the
theory in Section 2.1.3, one can interpret the points $\mu(T_n)$ as arrivals of a
standard homogeneous Poisson process. This is nicely illustrated in the top
right graph of Figure 2.1.22, where the sequence $(\mu(T_n))$ is plotted against
$n$. The graph is very close to a straight line, in contrast to the left graph in
Figure 2.1.19, where one can clearly see the deviations of the arrivals $T_n$ from
a straight line.
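The time change $T_n \mapsto \mu(T_n)$ with a piecewise linear mean value function can be sketched in a few lines of Python. The breakpoints, intensities and arrival times below are invented for illustration and are not the estimates of Table 2.1.20; the function name is hypothetical.

```python
def mean_value(t, breakpoints, intensities):
    """mu(t) = integral of lambda over [0, t] for a piecewise constant intensity.

    intensities[k] is the intensity on (breakpoints[k], breakpoints[k + 1]].
    """
    total = 0.0
    for k, lam in enumerate(intensities):
        left, right = breakpoints[k], breakpoints[k + 1]
        if t <= left:
            break
        total += lam * (min(t, right) - left)
    return total

# hypothetical setup: two "years" of 365 days with intensities 0.5 and 1.0
breakpoints = [0.0, 365.0, 730.0]
intensities = [0.5, 1.0]
toy_arrivals = [10.0, 200.0, 365.0, 400.0, 730.0]
transformed = [mean_value(s, breakpoints, intensities) for s in toy_arrivals]
```

If the piecewise constant intensity describes the data well, the transformed points behave like arrivals of a standard homogeneous Poisson process, so a plot of `transformed` against the index should be close to a straight line with slope 1.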

In the left middle graph we consider the histogram of the time changed arrival
times $\mu(T_n)$. According to the theory in Section 2.1.6, the arrival times of
a homogeneous Poisson process can be interpreted as a uniform sample on any fixed
interval, conditionally on the claim number in this interval. The histogram
resembles the histogram of a uniform sample in contrast to the middle right
graph, where the histogram of the Danish fire arrival times is presented. However,
the left histogram is not perfect either. This is due to the fact that the
data $T_n$ are integers, hence the values $\mu(T_n)$ live on a particular discrete set.

The left bottom graph shows a moving average estimate of the intensity
function of the arrivals $\mu(T_n)$. Although the function is close to 1 the estimates
fluctuate wildly around 1. This is an indication that the process might
not be Poisson and that other models for the arrival process could be more
appropriate; see for example Section 2.2. The deviation of the distribution of
the inter-arrival times $\mu(T_n) - \mu(T_{n-1})$, which according to the theory should
be iid standard exponential, can also be seen in the right bottom graph in Figure 2.1.22,
where a QQ-plot$^{15}$ of these data against the standard exponential
distribution is shown. The QQ-plot curves down at the right. This is a clear
indication of a right tail of the underlying distribution which is heavier than
the tail of the exponential distribution. These observations raise the question
as to whether the Poisson process is a suitable model for the whole period of
11 years of claim arrivals.

A homogeneous Poisson process is a suitable model for the arrivals of the
Danish fire insurance data for shorter periods of time such as one year. This
is illustrated in Figure 2.1.23 for the 166 arrivals in the period January 1 -
December 31, 1980.

As a matter of fact, the data show a clear seasonal component. This can
be seen in Figure 2.1.24, where a histogram of all arrivals modulo 366 is given.
Hence one obtains a distribution on the integers between 1 and 366. Notice
for example the peak around day 120 which corresponds to fires in April-May.
There is also more activity in summer than in early spring and late fall, and
one observes more fires in December and January with the exception of the
last week of the year.



2.1.8 An Informal Discussion of Transformed and Generalized 
Poisson Processes 

Consider a Poisson process $N$ with claim arrival times $T_i$ on $[0, \infty)$ and mean
value function $\mu$, independent of the iid positive claim sizes $X_i$ with distribution
function $F$. In this section we want to learn about a procedure which
allows one to merge the Poisson claim arrival times $T_i$ and the iid claim sizes
$X_i$ in one Poisson process with points in $\mathbb{R}^2$.

Define the counting process

$$M(a,b) = \#\{i \ge 1 : X_i \le a ,\ T_i \le b\} = \sum_{i=1}^{N(b)} I_{(0,a]}(X_i) , \qquad a, b \ge 0 .$$

We want to determine the distribution of $M(a,b)$. For this reason, recall the
characteristic function$^{16}$ of a Poisson random variable $M \sim \mathrm{Pois}(\gamma)$:

15 The reader who is unfamiliar with QQ-plots is referred to Section 3.2.1. 

16 In what follows we work with characteristic functions because this notion is de- 
fined for all distributions on R. Alternatively, we could replace the characteris- 
tic functions by moment generating functions. However, the moment generating 
function of a random variable is well-defined only if this random variable has 
certain finite exponential moments. This would restrict the class of distributions 
we consider. 









Figure 2.1.22 Top left: The estimated mean value function $\mu(t)$ of the Danish fire
insurance arrivals. The function is piecewise linear. The slopes are the estimated
intensities from Table 2.1.20. Top right: The transformed arrivals $\mu(T_n)$. Compare
with Figure 2.1.19. The histogram of the values $\mu(T_n)$ (middle left) resembles a
uniform density, whereas the histogram of the $T_n$'s shows clear deviations from it
(middle right). Bottom left: Moving average estimate of the intensity function
corresponding to the transformed sequence $(\mu(T_n))$. The estimates fluctuate around the
value 1. Bottom right: QQ-plot of the values $\mu(T_n) - \mu(T_{n-1})$ against the standard
exponential distribution. The plot curves down at the right end indicating that the
values come from a distribution with tails heavier than exponential.
Figure 2.1.23 The Danish fire insurance arrivals from January 1, 1980, until
December 31, 1980. The inter-arrival times have sample mean $\widehat\lambda^{-1} = 2.19$. Top left: The
renewal process $N(t)$ generated by the arrivals (solid boldface curve). For comparison,
one sample path of a homogeneous Poisson process with intensity $\widehat\lambda = (2.19)^{-1}$
is drawn. Top right: The histogram of the inter-arrival times. For comparison, the
density of the $\mathrm{Exp}(\widehat\lambda)$ distribution is drawn. Bottom left: QQ-plot for the
inter-arrival sample against the quantiles of the $\mathrm{Exp}(\widehat\lambda)$ distribution. The fit of the data
by an exponential $\mathrm{Exp}(\widehat\lambda)$ is not unreasonable. However, the QQ-plot indicates a
clear difference to exponential inter-arrival times: the data come from an
integer-valued distribution. This deficiency could be overcome if one knew the exact claim
times. Bottom right: The ratio $T_n/n$ as a function of time. The values cluster around
$\widehat\lambda^{-1} = 2.19$ which is indicated by the constant line. For a homogeneous Poisson
process, $T_n/n \to \lambda^{-1}$ a.s. by virtue of the strong law of large numbers. For an iid $\mathrm{Exp}(\lambda)$
sample $W_1, \ldots, W_n$, $\widehat\lambda = n/T_n$ is the maximum likelihood estimator of $\lambda$. If one
accepts the hypothesis that the arrivals in 1980 come from a homogeneous Poisson
process with intensity $\lambda = (2.19)^{-1}$, one would have an expected inter-arrival time
of 2.19, i.e., roughly every second day a claim occurs.
Figure 2.1.24 Histogram of all arrival times of the Danish fire insurance claims
considered as a distribution on the integers between 1 and 366. The bars of the his- 
togram correspond to the weeks of the year. There is a clear indication of seasonality 
in the data. 



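$$E e^{itM} = \sum_{n=0}^{\infty} e^{itn}\, P(M = n) = \sum_{n=0}^{\infty} e^{itn}\, e^{-\gamma}\, \frac{\gamma^n}{n!} = e^{-\gamma (1 - e^{it})} , \qquad t \in \mathbb{R} . \tag{2.1.22}$$

We know that the characteristic function of a random variable $M$ determines
its distribution and vice versa. Therefore we calculate the characteristic function
of $M(a,b)$. A similar argument as the one leading to (2.1.22) yields

$$\begin{aligned}
E e^{itM(a,b)} &= E\Big[E\Big(\exp\Big\{it \sum_{j=1}^{N(b)} I_{(0,a]}(X_j)\Big\} \,\Big|\, N(b)\Big)\Big] \\
&= E\Big[\big(E \exp\{it\, I_{(0,a]}(X_1)\}\big)^{N(b)}\Big] \\
&= E\Big(\big[1 - F(a) + F(a)\, e^{it}\big]^{N(b)}\Big) \\
&= e^{-\mu(b)\, F(a)\, (1 - e^{it})} .
\end{aligned} \tag{2.1.23}$$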

We conclude from (2.1.22) and (2.1.23) that $M(a,b) \sim \mathrm{Pois}(F(a)\, \mu(b))$. Using
similar characteristic function arguments, one can show that

• The increments

$$M((x, x+h] \times (t, t+s]) = \#\{i \ge 1 : (X_i, T_i) \in (x, x+h] \times (t, t+s]\} , \qquad x, t \ge 0 ,\ h, s > 0 ,$$

are $\mathrm{Pois}(F(x, x+h]\, \mu(t, t+s])$ distributed.

• For disjoint intervals $A_i = (x_i, x_i+h_i] \times (t_i, t_i+s_i]$, $i = 1, \ldots, n$, the
increments $M(A_i)$, $i = 1, \ldots, n$, are independent.

Figure 2.1.25 1000 points $(T_i, X_i)$ of a two-dimensional Poisson process, where
$(T_i)$ is the sequence of the arrival times of a homogeneous Poisson process with
intensity 1 and $(X_i)$ is a sequence of iid claim sizes, independent of $(T_i)$. Left:
Standard exponential claim sizes. Right: Pareto distributed claim sizes with
$P(X_i > x) = x^{-4}$, $x \ge 1$. Notice the difference in scale of the claim sizes!

From measure theory, we know that the quantities $F(x, x+h]\, \mu(t, t+s]$ determine
the product measure $\gamma = F \times \mu$ on the Borel $\sigma$-field of $[0, \infty)^2$, where
$F$ denotes the distribution function as well as the distribution of $X_i$ and $\mu$ is
the measure generated by the values $\mu(a, b]$, $0 \le a < b < \infty$. This is a consequence
of the extension theorem for measures; cf. Billingsley [13]. In the case
of a homogeneous Poisson process, $\mu = \lambda\, \mathrm{Leb}$, where Leb denotes Lebesgue
measure on $[0, \infty)$.

In analogy to the extension theorem for deterministic measures, one can
find an extension $M$ of the random counting variables $M(A)$, $A = (x, x+h] \times (t, t+s]$,
such that for any Borel set$^{17}$ $A \subset [0, \infty)^2$,

$$M(A) = \#\{i \ge 1 : (X_i, T_i) \in A\} \sim \mathrm{Pois}(\gamma(A)) ,$$

and for disjoint Borel sets $A_1, \ldots, A_n \subset [0, \infty)^2$, $M(A_1), \ldots, M(A_n)$ are
independent. We call $\gamma = F \times \mu$ the mean measure of $M$, and $M$ is called a
Poisson process or a Poisson random measure with mean measure $\gamma$, denoted
$M \sim \mathrm{PRM}(\gamma)$. Notice that $M$ is indeed a random counting measure on the
Borel $\sigma$-field of $[0, \infty)^2$.

17 For $A$ with mean measure $\gamma(A) = \infty$, we write $M(A) = \infty$.

The embedding of the claim arrival times and the claim sizes in a Poisson
process with two-dimensional points gives one a precise answer as to how many
claim sizes of a given magnitude occur in a fixed time interval. For example,
the number of claims exceeding a high threshold $u$, say, in the period $(a, b]$ of
time is given by

$$M((u, \infty) \times (a, b]) = \#\{i \ge 1 : X_i > u ,\ T_i \in (a, b]\} .$$

This is a $\mathrm{Pois}((1 - F(u))\, \mu(a, b])$ distributed random variable. It is independent
of the number of claims below the threshold $u$ occurring in the same time
interval. Indeed, the sets $(u, \infty) \times (a, b]$ and $[0, u] \times (a, b]$ are disjoint and
therefore $M((u, \infty) \times (a, b])$ and $M([0, u] \times (a, b])$ are independent Poisson
distributed random variables.
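The independence of the claim numbers above and below a threshold can be illustrated by simulation. The sketch below assumes a homogeneous Poisson process with intensity $\lambda$, so that $\mu(a,b] = \lambda (b - a)$, and standard exponential claim sizes, so that $1 - F(u) = e^{-u}$; these choices, the parameter values and the function names are assumptions made for illustration.

```python
import math
import random

def poisson_arrivals(lam, t, rng):
    """Arrival times in [0, t] of a homogeneous Poisson process with rate lam."""
    arrivals, clock = [], rng.expovariate(lam)
    while clock <= t:
        arrivals.append(clock)
        clock += rng.expovariate(lam)
    return arrivals

def threshold_counts(lam, u, a, b, rng):
    """Return (M((u, inf) x (a, b]), M([0, u] x (a, b])) for Exp(1) claim sizes."""
    above = below = 0
    for s in poisson_arrivals(lam, b, rng):
        if s > a:
            # claim size attached to an arrival in (a, b]
            if rng.expovariate(1.0) > u:
                above += 1
            else:
                below += 1
    return above, below

rng = random.Random(5)
lam, u, a, b = 3.0, 1.0, 1.0, 3.0
pairs = [threshold_counts(lam, u, a, b, rng) for _ in range(5000)]
mean_above = sum(p[0] for p in pairs) / len(pairs)
mean_below = sum(p[1] for p in pairs) / len(pairs)
covariance = sum(p[0] * p[1] for p in pairs) / len(pairs) - mean_above * mean_below
```

The sample means should be close to $(1 - F(u))\, \lambda (b-a)$ and $F(u)\, \lambda (b-a)$, and the sample covariance close to 0, in line with the independence of the two counts.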

In the previous sections$^{18}$ we used various transformations of the arrival
times $T_i$ of a Poisson process $N$ on $[0, \infty)$ with mean measure $\nu$, say, to derive
other Poisson processes on the interval $[0, \infty)$. The restriction of processes to
$[0, \infty)$ can be relaxed. Consider a measurable set $E \subset \mathbb{R}$ and equip $E$ with
the $\sigma$-field $\mathcal{E}$ of the Borel sets. Then

$$N(A) = \#\{i \ge 1 : T_i \in A\} , \qquad A \in \mathcal{E} ,$$

defines a random measure on the measurable space $(E, \mathcal{E})$. Indeed, $N(A) = N(A, \omega)$
depends on $\omega \in \Omega$ and for fixed $\omega$, $N(\cdot, \omega)$ is a counting measure on
$\mathcal{E}$. The set $E$ is called the state space of the random measure $N$. It is again
called a Poisson random measure or Poisson process with mean measure $\nu$
restricted to $E$ since one can show that $N(A) \sim \mathrm{Pois}(\nu(A))$ for $A \in \mathcal{E}$, and
$N(A_i)$, $i = 1, \ldots, n$, are mutually independent for disjoint $A_i \in \mathcal{E}$. The notion
of Poisson random measure is very general and can be extended to abstract
state spaces $E$. At the beginning of the section we considered a particular
example in $E = [0, \infty)^2$. The Poisson processes we considered in the previous
sections are examples of Poisson processes with state space $E = [0, \infty)$.

One of the strengths of this general notion of Poisson process is the fact
that Poisson random measures remain Poisson random measures under measurable
transformations. Indeed, let $\psi : E \to \widetilde E$ be such a transformation and
$\widetilde E$ be equipped with the $\sigma$-field $\widetilde{\mathcal{E}}$. Assume $N$ is $\mathrm{PRM}(\nu)$ on $E$ with points $T_i$.
Then the points $\psi(T_i)$ are in $\widetilde E$ and, for $A \in \widetilde{\mathcal{E}}$,

$$N_\psi(A) = \#\{i \ge 1 : \psi(T_i) \in A\} = \#\{i \ge 1 : T_i \in \psi^{-1}(A)\} = N(\psi^{-1}(A)) ,$$

where $\psi^{-1}(A) = \{x \in E : \psi(x) \in A\}$ denotes the inverse image of $A$
which belongs to $\mathcal{E}$ since $\psi$ is measurable. Then we also have that
$N_\psi(A) \sim \mathrm{Pois}(\nu(\psi^{-1}(A)))$ since $E N_\psi(A) = E N(\psi^{-1}(A)) = \nu(\psi^{-1}(A))$. Moreover,
since disjointness of $A_1, \ldots, A_n$ in $\widetilde{\mathcal{E}}$ implies disjointness of $\psi^{-1}(A_1), \ldots, \psi^{-1}(A_n)$
in $\mathcal{E}$, it follows that $N_\psi(A_1), \ldots, N_\psi(A_n)$ are independent, by the
corresponding property of the PRM $N$. We conclude that $N_\psi \sim \mathrm{PRM}(\nu(\psi^{-1}))$.

18 See, for example, Section 2.1.3.



Figure 2.1.26 Sample paths of the Poisson processes with arrival times $\exp\{T_i\}$
(bottom dashed curve), $T_i$ (middle dashed curve) and $\log T_i$ (top solid curve). The
$T_i$'s are the arrival times of a standard homogeneous Poisson process. Time is on
logarithmic scale ($\log(t)$ on the horizontal axis) in order to visualize the three paths in one graph.



Example 2.1.27 (Measurable transformations of Poisson processes remain
Poisson processes)

(1) Let $N$ be a Poisson process on $[0,\infty)$ with mean value function $\mu$ and
arrival times $0 < T_1 < T_2 < \cdots$. Consider the transformed process

$$\tilde N(t) = \#\{i \ge 1 : 0 \le T_i - a \le t\}\,, \quad 0 \le t \le b - a\,,$$

for some interval $[a,b] \subset [0,\infty)$, where $\psi(x) = x - a$ is clearly measurable. This
construction implies that $\tilde N(A) = \#\{i \ge 1 : \psi(T_i) \in A\} = 0$ for $A \subset [0,b-a]^c$,
the complement of $[0,b-a]$. Therefore it suffices to consider $\tilde N$ on the Borel
sets of $[0,b-a]$. This defines a Poisson process on $[a,b]$ with mean value
function $\tilde\mu(t) = \mu(t) - \mu(a)$, $t \in [a,b]$.

(2) Consider a standard homogeneous Poisson process on $[0,\infty)$ with arrival
times $0 < T_1 < T_2 < \cdots$. We transform the arrival times with the measurable
function $\psi(x) = \log x$. Then the points $(\log T_i)$ constitute a Poisson process
$\tilde N$ on $\mathbb{R}$. The Poisson measure of the interval $(a,b]$ for $a < b$ is given by

$$\tilde N(a,b] = \#\{i \ge 1 : \log(T_i) \in (a,b]\} = \#\{i \ge 1 : T_i \in (e^a, e^b]\}\,.$$

This is a $\mathrm{Pois}(e^b - e^a)$ distributed random variable, i.e., the mean measure
of the interval $(a,b]$ is given by $e^b - e^a$.

Alternatively, transform the arrival times $T_i$ by the exponential function. The
resulting Poisson process $M$ is defined on $[1,\infty)$. The Poisson measure of the
interval $(a,b] \subset [1,\infty)$ is given by

$$M(a,b] = \#\{i \ge 1 : e^{T_i} \in (a,b]\} = \#\{i \ge 1 : T_i \in (\log a, \log b]\}\,.$$

This is a $\mathrm{Pois}(\log(b/a))$ distributed random variable, i.e., the mean measure of
the interval $(a,b]$ is given by $\log(b/a)$. Notice that this Poisson process has the
remarkable property that $M(ca,cb]$ for any $c \ge 1$ has the same $\mathrm{Pois}(\log(b/a))$
distribution as $M(a,b]$. In particular, the expected number of points $\exp\{T_i\}$
falling into the interval $(ca,cb]$ is independent of the value $c \ge 1$. This is
somewhat counterintuitive since the length of the interval $(ca,cb]$ can be
arbitrarily large. However, the larger the value $c$ the higher the threshold $ca$
which prevents sufficiently many points $\exp\{T_i\}$ from falling into the interval
$(ca,cb]$, and on average there are as many points in $(ca,cb]$ as in $(a,b]$. □
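The scale invariance of the points $\exp\{T_i\}$ can be checked numerically. The following Python sketch is only an illustration under our own assumptions: the function names, the number of runs and the seeds are not part of the text. It simulates a standard homogeneous Poisson process from iid Exp(1) inter-arrival times, counts the transformed points in $(ca,cb]$ and compares the Monte Carlo average with $\log(b/a)$.

```python
import math
import random


def standard_poisson_arrivals(horizon, rng):
    """Arrival times T_1 < T_2 < ... in [0, horizon] of a standard
    homogeneous Poisson process, built from iid Exp(1) inter-arrival times."""
    arrivals = []
    t = rng.expovariate(1.0)
    while t <= horizon:
        arrivals.append(t)
        t += rng.expovariate(1.0)
    return arrivals


def count_transformed(arrivals, psi, a, b):
    """N_psi(a, b] = #{i : psi(T_i) in (a, b]}."""
    return sum(1 for t in arrivals if a < psi(t) <= b)


def mean_count_exp(a, b, c, runs, seed):
    """Monte Carlo estimate of E M(ca, cb] for the points exp(T_i)."""
    rng = random.Random(seed)
    # Arrivals beyond log(cb) cannot be mapped into (ca, cb].
    horizon = math.log(c * b)
    total = 0
    for _ in range(runs):
        arrivals = standard_poisson_arrivals(horizon, rng)
        total += count_transformed(arrivals, math.exp, c * a, c * b)
    return total / runs


if __name__ == "__main__":
    a, b = 2.0, 6.0
    print("theoretical mean log(b/a):", math.log(b / a))
    for c in (1.0, 10.0, 1000.0):
        print("c =", c, "estimate:", mean_count_exp(a, b, c, runs=20000, seed=1))
```

Up to simulation error, the estimates should not depend on $c$, in agreement with the Pois$(\log(b/a))$ distribution of $M(ca,cb]$.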

Example 2.1.28 (Construction of transformed planar PRM)

Let $(T_i)$ be the arrival sequence of a standard homogeneous Poisson process
on $[0,\infty)$, independent of the iid sequence $(X_i)$ with common distribution
function $F$. Then the points $(T_i, X_i)$ constitute a PRM($\nu$) $N$ with state space
$E = [0,\infty) \times \mathbb{R}$ and mean measure $\nu = \mathrm{Leb} \times F$; see the discussion on p. 45.

After a measurable transformation $\psi : \mathbb{R}^2 \to \mathbb{R}^2$ the points $\psi(T_i, X_i)$
constitute a PRM $N_\psi$ with state space $E_\psi = \{\psi(t,x) : (t,x) \in E\}$ and
mean measure $\nu_\psi(A) = \nu(\psi^{-1}(A))$ for any Borel set $A \subset E_\psi$. We choose
$\psi(t,x) = t^{-1/\alpha}(\cos(2\pi x), \sin(2\pi x))$ for some $\alpha \ne 0$, i.e., the PRM $N_\psi$ has
points $Y_i = T_i^{-1/\alpha}(\cos(2\pi X_i), \sin(2\pi X_i))$. In Figure 2.1.30 we visualize the
points $Y_i$ of the resulting PRM for different choices of $\alpha$ and distribution
functions $F$ of $X_1$.

Planar PRMs such as the ones described above are used, among others,
in spatial statistics (see Cressie [24]) in order to describe the distribution
of random configurations of points in the plane such as the distribution of
minerals, locations of highly polluted spots or trees in a forest. The particular
PRM $N_\psi$ and its modifications are major models in multivariate extreme
value theory. It describes the dependence of extremes in the plane and in
space. In particular, it is suitable for modeling clustering behavior of points
$Y_i$ far away from the origin. See Resnick [64] for the theoretical background
on multivariate extreme value theory and Mikosch [58] for a recent attempt
to use $N_\psi$ for modeling multivariate financial time series. □
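The construction of the points $Y_i$ can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the figures: the function names and the callback `sample_mark`, which draws one $X_i$, are our own assumptions.

```python
import math
import random


def psi(t, x, alpha):
    """psi(t, x) = t^(-1/alpha) * (cos(2 pi x), sin(2 pi x))."""
    radius = t ** (-1.0 / alpha)
    angle = 2.0 * math.pi * x
    return (radius * math.cos(angle), radius * math.sin(angle))


def transformed_points(n, alpha, sample_mark, seed=0):
    """First n points Y_i = psi(T_i, X_i), where (T_i) are the arrivals of a
    standard homogeneous Poisson process and X_i = sample_mark(rng) are iid."""
    rng = random.Random(seed)
    points = []
    t = 0.0
    for _ in range(n):
        t += rng.expovariate(1.0)  # T_i = W_1 + ... + W_i with iid Exp(1) W_i
        points.append(psi(t, sample_mark(rng), alpha))
    return points


if __name__ == "__main__":
    # Uniform marks: the spherical part is uniform on the unit circle.
    ys = transformed_points(2000, 5.0, lambda rng: rng.random())
    print(ys[:3])
```

For $\alpha > 0$ the radii $|Y_i| = T_i^{-1/\alpha}$ decrease in $i$, so the points generated first are the ones farthest from the origin.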

Example 2.1.29 (Modeling arrivals of Incurred But Not Reported (IBNR)
claims)

In a portfolio, the claims are not reported at their arrival times $T_i$, but with
a certain delay. This delay may be due to the fact that the policyholder is
not aware of the claim and only realizes it later (for example, a damage in
his/her house), or that the policyholder was injured in a car accident and did
not have the opportunity to call his agent immediately, or the policyholder's
flat burnt down over Christmas, but the agent was on a skiing vacation in
Switzerland and could not receive the report about the fire, etc.

Figure 2.1.30 Poisson random measures in the plane.

Top left: 2 000 points of a Poisson random measure with points $(T_i, X_i)$, where
$(T_i)$ is the arrival sequence of a standard homogeneous Poisson process on $[0,\infty)$,
independent of the iid sequence $(X_i)$ with $X_1 \sim U(0,1)$. The PRM has mean measure
$\nu = \mathrm{Leb} \times \mathrm{Leb}$ on $[0,\infty) \times (0,1)$.

After the measurable transformation $\psi(t,x) = t^{-1/\alpha}(\cos(2\pi x), \sin(2\pi x))$ for some
$\alpha \ne 0$ the resulting PRM $N_\psi$ has points $Y_i = T_i^{-1/\alpha}(\cos(2\pi X_i), \sin(2\pi X_i))$.

Top right: The points of the process $N_\psi$ for $\alpha = 5$ and iid $U(0,1)$ uniform $X_i$'s.
Notice that the spherical part $(\cos(2\pi X_i), \sin(2\pi X_i))$ of $Y_i$ is uniformly distributed
on the unit circle.

Bottom left: The points of the process $N_\psi$ with $\alpha = -5$ and iid $U(0,1)$ uniform $X_i$'s.
Bottom right: The points of the process $N_\psi$ for $\alpha = 5$ with iid $X_i \sim \mathrm{Pois}(10)$.

We consider a simple model for the reporting times of IBNR claims: the
arrival times $T_i$ of the claims are modeled by a Poisson process $N$ with mean
value function $\mu$ and the delays in reporting by an iid sequence $(V_i)$ of positive
random variables with common distribution $F$. Then the sequence $(T_i + V_i)$
constitutes the reporting times of the claims to the insurance business. We
assume that $(V_i)$ and $(T_i)$ are independent. Then the points $(T_i, V_i)$ constitute
a PRM($\nu$) with mean measure $\nu = \mu \times F$. By time $t$, $N(t)$ claims have
occurred, but only

$$N_{\mathrm{IBNR}}(t) = \sum_{i=1}^{N(t)} I_{[0,t]}(T_i + V_i) = \#\{i \ge 1 : T_i + V_i \le t\}$$

have been reported. The mapping $\psi(t,v) = t + v$ is measurable. It transforms
the points $(T_i, V_i)$ of the PRM($\nu$) into the points $T_i + V_i$ of the PRM $N_\psi$
with mean measure of a set $A$ given by $\nu_\psi(A) = \nu(\psi^{-1}(A))$. In particular,
$N_{\mathrm{IBNR}}(s) = N_\psi([0,s])$ is $\mathrm{Pois}(\nu_\psi([0,s]))$ distributed. We calculate the mean
value

$$\nu_\psi([0,s]) = (\mu \times F)\{(t,v) : 0 \le t + v \le s\}
= \int_{t=0}^{s} \int_{v=0}^{s-t} dF(v)\, d\mu(t) = \int_0^s F(s-t)\, d\mu(t)\,.$$

If $N$ is homogeneous Poisson with intensity $\lambda > 0$, $\mu = \lambda\,\mathrm{Leb}$, and then

$$\nu_\psi([0,s]) = \lambda \int_0^s F(t)\, dt = \lambda s - \lambda \int_0^s \overline F(t)\, dt\,, \tag{2.1.24}$$

where $\overline F = 1 - F$ is the tail of the distribution function $F$. The second term in
(2.1.24) converges to the value $\lambda EV_1 = \lambda \int_0^\infty \overline F(t)\,dt$ as $s \to \infty$. The delayed
claim numbers $N_{\mathrm{IBNR}}(s)$ constitute an inhomogeneous Poisson process on
$[0,\infty)$ whose mean value function differs from $EN(s) = \lambda s$ by the value
$\lambda \int_0^s \overline F(t)\, dt$. If $EV_1 < \infty$ and $h > 0$ is fixed, the difference of the mean values
of the increments $N(s,s+h]$ and $N_{\mathrm{IBNR}}(s,s+h]$ is asymptotically negligible. □
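Formula (2.1.24) can be checked by simulation. The sketch below is an illustration under our own assumptions: it uses Pareto delays with $P(V_1 > x) = x^{-2}$, $x \ge 1$ (the choice made in Figure 2.1.31), for which $\lambda \int_0^s F(t)\,dt = \lambda (s - 2 + 1/s)$ for $s \ge 1$. The function names, run counts and seeds are hypothetical.

```python
import random


def ibnr_mean_pareto(lam, s):
    """lam * integral_0^s F(t) dt for delays with P(V > x) = x^(-2), x >= 1.

    F vanishes on [0, 1], so the mean is 0 for s <= 1 and
    lam * (s - 2 + 1/s) for s >= 1."""
    if s <= 1.0:
        return 0.0
    return lam * (s - 2.0 + 1.0 / s)


def simulate_reported(lam, s, rng):
    """One draw of N_IBNR(s) = #{i : T_i + V_i <= s}."""
    reported = 0
    t = rng.expovariate(lam)  # first arrival of the homogeneous Poisson process
    while t <= s:
        # Inverse transform: U^(-1/2) has tail P(V > x) = x^(-2), x >= 1.
        v = (1.0 - rng.random()) ** -0.5
        if t + v <= s:
            reported += 1
        t += rng.expovariate(lam)
    return reported


def monte_carlo_mean(lam, s, runs, seed):
    """Average of N_IBNR(s) over independent simulated portfolios."""
    rng = random.Random(seed)
    return sum(simulate_reported(lam, s, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    lam, s = 1.0, 50.0
    print("formula (2.1.24):", ibnr_mean_pareto(lam, s))
    print("simulation      :", monte_carlo_mean(lam, s, runs=4000, seed=7))
```

Claims arriving after time $s$ are ignored in the loop because a positive delay can only postpone the report.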



Comments 

The Poisson process is one of the most important stochastic processes. For the 
abstract understanding of this process one would have to consider it as a point 
process, i.e., as a random counting measure. We have indicated in Section 2.1.8 
how one has to approach this problem. As a matter of fact, various other 
counting processes such as the renewal process treated in Section 2.2 are
approximated by suitable Poisson processes in the sense of convergence in
distribution. Therefore the Poisson process with its nice mathematical properties
is also a good approximation to various real-life counting processes such as
the claim number process in an insurance portfolio.

Figure 2.1.31 Incurred But Not Reported claims. We visualize one sample of a
standard homogeneous Poisson process with $n$ arrivals $T_i$ (top boldface graph) and
the corresponding claim number process for the delayed process with arrivals $T_i + V_i$,
where the $V_i$'s are iid Pareto distributed with distribution $P(V_1 > x) = x^{-2}$, $x \ge 1$,
independent of $(T_i)$. Top: $n = 30$ (left) and $n = 50$ (right). Bottom: $n = 100$ (left)
and $n = 300$ (right). As explained in Example 2.1.29, the sample paths of the claim
number processes differ from each other approximately by the constant value $EV_1$. For
sufficiently large $t$, the difference is negligible compared to the expected claim number.

The treatment of general Poisson processes requires more stochastic pro- 
cess theory than available in this course. For a gentle introduction we refer 
to Embrechts et al. [29], Chapter 5; for a rigorous treatment at a moderate 
level, Resnick’s [65] monograph or Kingman’s book [50] are good references. 
Resnick’s monograph [64] is a more advanced text on the Poisson process with 
various applications to extreme value theory. See also Daley and Vere-Jones
[25] or Kallenberg [48] for some advanced treatments. 

Exercises 

Sections 2.1.1-2.1.2

(1) Let $N = (N(t))_{t \ge 0}$ be a Poisson process with continuous intensity function
$(\lambda(t))_{t \ge 0}$.

(a) Show that the intensities $\lambda_{n,n+k}(t)$, $n \ge 0$, $k \ge 1$ and $t > 0$, of the Markov process
$N$ with transition probabilities $p_{n,n+k}(s,t)$ exist, i.e.,

$$\lambda_{n,n+k}(t) = \lim_{h \downarrow 0} \frac{p_{n,n+k}(t,t+h)}{h}\,, \quad n \ge 0\,,\ k \ge 1\,,$$

and that they are given by

$$\lambda_{n,n+k}(t) = \begin{cases} \lambda(t)\,, & k = 1\,, \\ 0\,, & k \ge 2\,. \end{cases} \tag{2.1.25}$$

(b) What can you conclude from $p_{n,n+k}(t,t+h)$ for $h$ small about the short term
jump behavior of the Markov process $N$?

(c) Show by counterexample that (2.1.25) is in general not valid if one gives up the
assumption of continuity of the intensity function $\lambda(t)$.

(2) Let $N = (N(t))_{t \ge 0}$ be a Poisson process with continuous intensity function
$(\lambda(t))_{t \ge 0}$. By using the properties of $N$ given in Definition 2.1.1, show that the
following properties hold:

(a) The sample paths of $N$ are non-decreasing.

(b) The process $N$ does not have a jump at zero with probability 1.

(c) For every fixed $t$, the process $N$ does not have a jump at $t$ with probability 1.
Does this mean that the sample paths do not have jumps?

(3) Let $N$ be a homogeneous Poisson process on $[0,\infty)$ with intensity $\lambda > 0$. Show
that for $0 < t_1 < t < t_2$,

$$\lim_{h \downarrow 0} P\big(N(t_1 - h, t - h] = 0\,,\ N(t - h, t] = 1\,,\ N(t, t_2] = 0 \mid N(t - h, t] > 0\big)
= e^{-\lambda (t - t_1)}\, e^{-\lambda (t_2 - t)}\,.$$

Give an intuitive interpretation of this property.




(4) Let $N_1,\ldots,N_n$ be independent Poisson processes on $[0,\infty)$ defined on the same
probability space. Show that $N_1 + \cdots + N_n$ is a Poisson process and determine
its mean value function.

This property extends the well-known property that the sum $M_1 + M_2$ of two
independent Poisson random variables $M_1 \sim \mathrm{Pois}(\lambda_1)$ and $M_2 \sim \mathrm{Pois}(\lambda_2)$ is
$\mathrm{Pois}(\lambda_1 + \lambda_2)$. We also mention that a converse to this result holds. Indeed,
suppose $M = M_1 + M_2$, $M \sim \mathrm{Pois}(\lambda)$ for some $\lambda > 0$ and $M_1$, $M_2$ are independent
non-negative random variables. Then both $M_1$ and $M_2$ are necessarily Poisson
random variables. This phenomenon is referred to as Raikov's theorem; see
Lukacs [54], Theorem 8.2.2. An analogous theorem can be shown for so-called
point processes which are counting processes on $[0,\infty)$, including the Poisson
process and the renewal process. Indeed, if the Poisson process $N$ has
representation $N = N_1 + N_2$ for independent point processes $N_1$, $N_2$, then $N_1$ and $N_2$
are necessarily Poisson processes.

(5) Consider the total claim amount process $S$ in the Cramér-Lundberg model.

(a) Show that the total claim amount $S(s,t]$ in $(s,t]$ for $s < t$, i.e., $S(s,t] = S(t) -
S(s)$, has the same distribution as the total claim amount in $[0,t-s]$, i.e.,
$S(t-s)$.

(b) Show that, for every $0 = t_0 < t_1 < \cdots < t_n$ and $n \ge 1$, the random
variables $S(t_1,t_2],\ldots,S(t_{n-1},t_n]$ are independent. Hint: Calculate the joint
characteristic function of the latter random variables.

(6) For a homogeneous Poisson process $N$ on $[0,\infty)$ show that for $0 < s < t$,

$$P(N(s) = k \mid N(t)) = \begin{cases} \dbinom{N(t)}{k} \left(\dfrac{s}{t}\right)^{k} \left(1 - \dfrac{s}{t}\right)^{N(t)-k} & \text{if } k \le N(t)\,, \\[2mm] 0 & \text{if } k > N(t)\,. \end{cases}$$



Section 2.1.3

(7) Let $\tilde N$ be a standard homogeneous Poisson process on $[0,\infty)$ and $N$ a Poisson
process on $[0,\infty)$ with mean value function $\mu$.

(a) Show that $N_1 = (\tilde N(\mu(t)))_{t \ge 0}$ is a Poisson process on $[0,\infty)$ with mean value
function $\mu$.

(b) Assume that the inverse $\mu^{-1}$ of $\mu$ exists, is continuous and $\lim_{t \to \infty} \mu(t) = \infty$.
Show that $\tilde N_1(t) = N(\mu^{-1}(t))$ defines a standard homogeneous Poisson process
on $[0,\infty)$.

(c) Assume that the Poisson process $N$ has an intensity function $\lambda$. Which condition
on $\lambda$ ensures that $\mu^{-1}(t)$ exists for $t \ge 0$?

(d) Let $f : [0,\infty) \to [0,\infty)$ be a non-decreasing continuous function with $f(0) = 0$.
Show that

$$N_f(t) = N(f(t))\,, \quad t \ge 0\,,$$

is again a Poisson process on $[0,\infty)$. Determine its mean value function.

Sections 2.1.4-2.1.5

(8) The homogeneous Poisson process $\tilde N$ with intensity $\lambda > 0$ can be written as a
renewal process

$$\tilde N(t) = \#\{i \ge 1 : \tilde T_i \le t\}\,, \quad t \ge 0\,,$$

where $\tilde T_n = \tilde W_1 + \cdots + \tilde W_n$ and $(\tilde W_n)$ is an iid $\mathrm{Exp}(\lambda)$ sequence. Let $N$ be a
Poisson process with mean value function $\mu$ which has an a.e. positive continuous
intensity function $\lambda$. Let $0 < T_1 < T_2 < \cdots$ be the arrival times of the process $N$.

(a) Show that the random variables $\int_{T_n}^{T_{n+1}} \lambda(s)\, ds$ are iid exponentially distributed.

(b) Show that, with probability 1, no multiple claims can occur, i.e., at an
arrival time $T_i$ of a claim, $N(T_i) - N(T_i-) = 1$ a.s. and $P(N(T_i) - N(T_i-) >
1 \text{ for some } i) = 0$.

(9) Consider a homogeneous Poisson process $N$ with intensity $\lambda > 0$ and arrival
times $T_i$.

(a) Assume the renewal representation $N(t) = \#\{i \ge 1 : T_i \le t\}$, $t \ge 0$, for $N$,
i.e., $T_0 = 0$, $W_i = T_i - T_{i-1}$ are iid $\mathrm{Exp}(\lambda)$ inter-arrival times. Calculate for
$0 \le t_1 < t_2$,

$$P(T_1 \le t_1) \quad \text{and} \quad P(T_1 \le t_1\,,\ T_2 \le t_2)\,. \tag{2.1.26}$$

(b) Assume the properties of Definition 2.1.1 for $N$. Calculate for $0 \le t_1 < t_2$,

$$P(N(t_1) \ge 1) \quad \text{and} \quad P(N(t_1) \ge 1\,,\ N(t_2) \ge 2)\,. \tag{2.1.27}$$

(c) Give reasons why you get the same probabilities in (2.1.26) and (2.1.27).

(10) Consider a homogeneous Poisson process on $[0,\infty)$ with arrival time sequence
$(T_i)$ and set $T_0 = 0$. The inter-arrival times are defined as $W_i = T_i - T_{i-1}$, $i \ge 1$.

(a) Show that $T_1$ has the forgetfulness property, i.e., $P(T_1 > t + s \mid T_1 > t) =
P(T_1 > s)$, $t, s \ge 0$.

(b) Another version of the forgetfulness property is as follows. Let $Y \ge 0$ be
independent of $T_1$ and $Z$ be a random variable whose distribution is given by

$$P(Z > z) = P(T_1 > Y + z \mid T_1 > Y)\,, \quad z \ge 0\,.$$

Then $Z$ and $T_1$ have the same distribution. Verify this.

(c) Show that the events $\{W_1 < W_2\}$ and $\{\min(W_1, W_2) > x\}$ are independent.

(d) Determine the distribution of $m_n = \min(T_1, T_2 - T_1, \ldots, T_n - T_{n-1})$.

(11) Suppose you want to simulate sample paths of a Poisson process. 

(a) How can you exploit the renewal representation to simulate paths of a homoge- 
neous Poisson process? 

(b) How can you use the renewal representation of a homogeneous Poisson N to 
simulate paths of an inhomogeneous Poisson process? 

Section 2.1.6

(12) Let $U_1,\ldots,U_n$ be an iid $U(0,1)$ sample with the corresponding order statistics
$U_{(1)} < \cdots < U_{(n)}$ a.s. Let $(W_i)$ be an iid sequence of $\mathrm{Exp}(\lambda)$ distributed
random variables and $T_n = W_1 + \cdots + W_n$ the corresponding arrival times of a
homogeneous Poisson process with intensity $\lambda$.

(a) Show that the following identity in distribution holds for every fixed $n \ge 1$:

$$\big(U_{(1)},\ldots,U_{(n)}\big) \stackrel{d}{=} \left(\frac{T_1}{T_{n+1}},\ldots,\frac{T_n}{T_{n+1}}\right). \tag{2.1.28}$$

Hint: Calculate the densities of the vectors on both sides of (2.1.28). The density
of the vector

$$\big[(T_1,\ldots,T_n)/T_{n+1}\,,\ T_{n+1}\big]$$

can be obtained from the known density of the vector $(T_1,\ldots,T_{n+1})$.

(b) Why is the distribution of the right-hand vector in (2.1.28) independent of $\lambda$?

(c) Let $T_i$ be the arrivals of a Poisson process on $[0,\infty)$ with a.e. positive
intensity function $\lambda$ and mean value function $\mu$. Show that the following identity in
distribution holds for every fixed $n \ge 1$:

$$\big(U_{(1)},\ldots,U_{(n)}\big) \stackrel{d}{=} \left(\frac{\mu(T_1)}{\mu(T_{n+1})},\ldots,\frac{\mu(T_n)}{\mu(T_{n+1})}\right).$$

(13) Let $W_1,\ldots,W_n$ be an iid $\mathrm{Exp}(\lambda)$ sample for some $\lambda > 0$. Show that the ordered
sample $W_{(1)} < \cdots < W_{(n)}$ has representation in distribution:

$$\big(W_{(1)},\ldots,W_{(n)}\big) \stackrel{d}{=} \left(\frac{W_n}{n}\,,\ \frac{W_n}{n} + \frac{W_{n-1}}{n-1}\,,\ \ldots\,,\ \frac{W_n}{n} + \frac{W_{n-1}}{n-1} + \cdots + \frac{W_2}{2}\,,\ \frac{W_n}{n} + \frac{W_{n-1}}{n-1} + \cdots + \frac{W_1}{1}\right).$$

Hint: Use a density transformation starting with the joint density of $W_1,\ldots,W_n$
to determine the density of the right-hand expression.

(14) Consider the stochastically discounted total claim amount

$$S(t) = \sum_{i=1}^{N(t)} e^{-r T_i} X_i\,,$$

where $r > 0$ is an interest rate, $0 < T_1 < T_2 < \cdots$ are the claim arrival times,
defining the homogeneous Poisson process $N(t) = \#\{i \ge 1 : T_i \le t\}$, $t \ge 0$, with
intensity $\lambda > 0$, and $(X_i)$ is an iid sequence of positive claim sizes, independent
of $(T_i)$.

(a) Calculate the mean and the variance of $S(t)$ by using the order statistics
property of the Poisson process $N$. Specify the mean and the variance in the case
when $r = 0$ (Cramér-Lundberg model).

(b) Show that $S(t)$ has the same distribution as

$$e^{-rt} \sum_{i=1}^{N(t)} e^{r T_i} X_i\,.$$

(15) Suppose you want to simulate sample paths of a Poisson process on $[0,T]$ for
$T > 0$ and a given continuous intensity function $\lambda$, by using the order statistics
property.

(a) How should you proceed if you are interested in one path with exactly $n$ jumps
in $[0,T]$?

(b) How would you simulate several paths of a homogeneous Poisson process with
(possibly) different jump numbers in $[0,T]$?

(c) How could you use the simulated paths of a homogeneous Poisson process to
obtain the paths of an inhomogeneous one with given intensity function?

(16) Let $(T_i)$ be the arrival sequence of a standard homogeneous Poisson process $N$
and $\alpha \in (0,1)$.

(a) Show that the infinite series

$$X_\alpha = \sum_{i=1}^{\infty} T_i^{-1/\alpha} \tag{2.1.29}$$

converges a.s. Hint: Use the strong law of large numbers for $(T_n)$.

(b) Show that

$$X_{N(t)} = \sum_{i=1}^{N(t)} T_i^{-1/\alpha} \stackrel{\text{a.s.}}{\to} X_\alpha \quad \text{as } t \to \infty\,.$$

Hint: Use Lemma 2.2.6.

(c) It follows from standard limit theory for sums of iid random variables (see
Feller [32], Theorem 1 in Chapter XVII.5) that for iid $U(0,1)$ random variables
$U_i$,

$$n^{-1/\alpha} \sum_{i=1}^{n} U_i^{-1/\alpha} \stackrel{d}{\to} Z_\alpha\,, \tag{2.1.30}$$

where $Z_\alpha$ is a positive random variable with an $\alpha$-stable distribution determined
by its Laplace-Stieltjes transform $E\exp\{-s Z_\alpha\} = \exp\{-c\, s^\alpha\}$ for some $c > 0$,
all $s \ge 0$. See p. 182 for some information about Laplace-Stieltjes transforms.
Show that $X_\alpha \stackrel{d}{=} c' Z_\alpha$ for some positive constant $c' > 0$.

Hints: (i) Apply the order statistics property of the homogeneous Poisson process
to $X_{N(t)}$ to conclude that

$$X_{N(t)} \stackrel{d}{=} t^{-1/\alpha} \sum_{i=1}^{N(t)} U_i^{-1/\alpha}\,,$$

where $(U_i)$ is an iid $U(0,1)$ sequence, independent of $N(t)$.

(ii) Prove that

$$(N(t))^{-1/\alpha} \sum_{i=1}^{N(t)} U_i^{-1/\alpha} \stackrel{d}{\to} Z_\alpha \quad \text{as } t \to \infty\,.$$

Hint: Condition on $N(t)$ and exploit (2.1.30).

(iii) Use the strong law of large numbers $N(t)/t \stackrel{\text{a.s.}}{\to} 1$ as $t \to \infty$ (Theorem 2.2.4)
and the continuous mapping theorem to conclude the proof.

(d) Show that $EX_\alpha = \infty$.

(e) Let $Z_1,\ldots,Z_n$ be iid copies of the $\alpha$-stable random variable $Z_\alpha$ with
Laplace-Stieltjes transform $Ee^{-s Z_\alpha} = e^{-c\, s^\alpha}$, $s \ge 0$, for some $\alpha \in (0,1)$ and $c > 0$.
Show that for every $n \ge 1$ the relation

$$Z_1 + \cdots + Z_n \stackrel{d}{=} n^{1/\alpha} Z_\alpha$$

holds. It is due to this "stability condition" that the distribution gained its
name.

Hint: Use the properties of Laplace-Stieltjes transforms (see p. 182) to show this
property.

(f) Consider $Z_\alpha$ from (e) for some $\alpha \in (0,1)$.

(i) Show the relation

$$E e^{\,i t A Z_\alpha^{1/2}} = e^{-c\,|t|^{2\alpha}}\,, \quad t \in \mathbb{R}\,, \tag{2.1.31}$$

where $A \sim N(0,2)$ is independent of $Z_\alpha$. A random variable $Y$ with characteristic
function given by the right-hand side of (2.1.31) and its distribution are said to be
symmetric $2\alpha$-stable.

(ii) Let $Y_1,\ldots,Y_n$ be iid copies of $Y$ from (i). Show the stability relation

$$Y_1 + \cdots + Y_n \stackrel{d}{=} n^{1/(2\alpha)}\, Y\,.$$

(iii) Conclude that $Y$ must have infinite variance. Hint: Suppose that $Y$ has
finite variance and try to apply the central limit theorem.

The interested reader who wants to learn more about the exciting class of stable
distributions and stable processes is referred to Samorodnitsky and Taqqu [70].

Section 2.1.8 

(17) Let $(N(t))_{t \ge 0}$ be a standard homogeneous Poisson process with claim arrival
times $T_i$.

(a) Show that the sequences of arrival times $(\sqrt{T_i})$ and $(T_i^2)$ define two Poisson
processes $N_1$ and $N_2$, respectively, on $[0,\infty)$. Determine their mean measures
by calculating $EN_i(s,t]$ for any $s < t$, $i = 1,2$.

(b) Let $N_3$ and $N_4$ be Poisson processes on $[0,\infty)$ with mean value functions $\mu_3(t) =
\sqrt{t}$ and $\mu_4(t) = t^2$ and arrival time sequences $(T_i^{(3)})$ and $(T_i^{(4)})$, respectively.
Show that the processes $(N_3(t^2))_{t \ge 0}$ and $(N_4(\sqrt{t}))_{t \ge 0}$ are Poisson on $[0,\infty)$ and
have the same distribution.

(c) Show that the process

$$N_5(t) = \#\{i \ge 1 : e^{T_i} \le t + 1\}\,, \quad t \ge 0\,,$$

is a Poisson process and determine its mean value function.

(d) Let $N_6$ be a Poisson process on $[0,\infty)$ with mean value function $\mu_6(t) = \log(1 +
t)$. Show that $N_6$ has the property that, for $1 \le s < t$ and $a \ge 1$, the distribution
of $N_6(at - 1) - N_6(as - 1)$ does not depend on $a$.

(18) Let $(T_i)$ be the arrival times of a homogeneous Poisson process $N$ on $[0,\infty)$ with
intensity $\lambda > 0$, independent of the iid claim size sequence $(X_i)$ with $X_1 > 0$
and distribution function $F$.

(a) Show that for $s < t$ and $a < b$ the counting random variable

$$M((s,t] \times (a,b]) = \#\{i \ge 1 : T_i \in (s,t]\,,\ X_i \in (a,b]\}$$

is $\mathrm{Pois}(\lambda (t-s) F(a,b])$ distributed.

(b) Let $A_i = (s_i,t_i] \times (a_i,b_i]$ for $s_i < t_i$ and $a_i < b_i$, $i = 1,2$, be disjoint. Show that
$M(A_1)$ and $M(A_2)$ are independent.

(19) Consider the two-dimensional PRM $N_\psi$ from Figure 2.1.30 with $\alpha > 0$.

(a) Calculate the mean measure of the set $A(r,S) = \{x : |x| > r\,,\ x/|x| \in S\}$, where
$r > 0$ and $S$ is any Borel subset of the unit circle.

(b) Show that $EN_\psi(A(rt,S)) = t^{-\alpha} EN_\psi(A(r,S))$ for any $t > 0$.

(c) Let $Y = R(\cos(2\pi X), \sin(2\pi X))$, where $P(R > x) = x^{-\alpha}$, $x \ge 1$, $X$ is
uniformly distributed on $(0,1)$ and independent of $R$. Show that for $r > 1$,

$$EN_\psi(A(r,S)) = P(Y \in A(r,S))\,.$$

(20) Let $(E,\mathcal{E},\mu)$ be a measure space such that $0 < \mu(E) < \infty$ and $\tau$ be $\mathrm{Pois}(\mu(E))$
distributed. Assume that $\tau$ is independent of the iid sequence $(X_i)$ with
distribution given by

$$F_{X_1}(A) = P(X_1 \in A) = \mu(A)/\mu(E)\,, \quad A \in \mathcal{E}\,.$$

(a) Show that the counting process

$$N(A) = \sum_{i=1}^{\tau} I_A(X_i)\,, \quad A \in \mathcal{E}\,,$$

is PRM($\mu$) on $E$. Hint: Calculate the joint characteristic function of the random
variables $N(A_1),\ldots,N(A_m)$ for any disjoint $A_1,\ldots,A_m \in \mathcal{E}$.

(b) Specify the construction of (a) in the case that $E = [0,1]$ equipped with the
Borel $\sigma$-field, when $\mu$ has an a.e. positive density $\lambda$. What is the relation with
the order statistics property of the Poisson process $N$?

(c) Specify the construction of (a) in the case that $E = [0,1]^d$ equipped with the
Borel $\sigma$-field for some integer $d \ge 1$ when $\mu = \lambda\,\mathrm{Leb}$ for some constant $\lambda > 0$.
Propose how one could define an "order statistics property" for this
(homogeneous) Poisson process with points in $E$.

(21) Let $\tau$ be a $\mathrm{Pois}(1)$ random variable, independent of the iid sequence $(X_i)$ with
common distribution function $F$ and a positive density on $(0,\infty)$.

(a) Show that

$$N(t) = \sum_{i=1}^{\tau} I_{(0,t]}(X_i)\,, \quad t \ge 0\,,$$

defines a Poisson process on $[0,\infty)$ in the sense of Definition 2.1.1.

(b) Determine the mean value function of $N$.

(c) Find a function $f : [0,\infty) \to [0,\infty)$ such that the time changed process
$(N(f(t)))_{t \ge 0}$ becomes a standard homogeneous Poisson process.

(22) For an iid sequence $(X_i)$ with common continuous distribution function $F$ define
the sequence of partial maxima $M_n = \max(X_1,\ldots,X_n)$, $n \ge 1$. Define $L(1) = 1$
and, for $n \ge 1$,

$$L(n+1) = \inf\{k > L(n) : X_k > X_{L(n)}\}\,.$$

The sequence $(X_{L(n)})$ is called the record value sequence and $(L(n))$ is the
sequence of the record times.

It is well-known that for an iid standard exponential sequence $(W_i)$ with record
time sequence $(L(n))$, $(W_{L(n)})$ constitute the arrivals of a standard homogeneous
Poisson process on $[0,\infty)$; see Resnick [64], Proposition 4.1.

(a) Let $R(x) = -\log \overline F(x)$, where $\overline F = 1 - F$ and $x \in (x_l,x_r)$, $x_l = \inf\{x : F(x) >
0\}$ and $x_r = \sup\{x : F(x) < 1\}$. Show that $(X_{L(n)}) \stackrel{d}{=} (R^{\leftarrow}(W_{L(n)}))$, where
$R^{\leftarrow}(t) = \inf\{x \in (x_l,x_r) : R(x) \ge t\}$ is the generalized inverse of $R$. See
Resnick [64], Proposition 4.1.

(b) Conclude from (a) that $(X_{L(n)})$ is the arrival sequence of a Poisson process on
$(x_l,x_r)$ with mean measure of $(a,b] \subset (x_l,x_r)$ given by $R(a,b]$.

2.2 The Renewal Process

2.2.1 Basic Properties 

In Section 2.1.4 we learned that the homogeneous Poisson process is a partic- 
ular renewal process. In this section we want to study this model. We start 
with a formal definition. 

Definition 2.2.1 (Renewal process)

Let $(W_i)$ be an iid sequence of a.s. positive random variables. Then the random
walk

$$T_0 = 0\,, \quad T_n = W_1 + \cdots + W_n\,, \quad n \ge 1\,,$$

is said to be a renewal sequence and the counting process

$$N(t) = \#\{i \ge 1 : T_i \le t\}\,, \quad t \ge 0\,,$$

is the corresponding renewal (counting) process.

We also refer to $(T_n)$ and $(W_n)$ as the sequences of the arrival and inter-arrival
times of the renewal process $N$, respectively.
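The definition translates directly into a short computation. The following Python sketch is an illustration only; the function names and the callback `sample_w`, which draws one inter-arrival time, are our own assumptions.

```python
import bisect
import itertools
import random


def renewal_sequence(sample_w, n, rng):
    """T_1, ..., T_n with T_k = W_1 + ... + W_k for iid W_i = sample_w(rng)."""
    return list(itertools.accumulate(sample_w(rng) for _ in range(n)))


def renewal_count(arrivals, t):
    """N(t) = #{i >= 1 : T_i <= t} for a sorted list of arrival times.

    The value is only reliable for t <= arrivals[-1], since later
    arrivals have not been generated."""
    return bisect.bisect_right(arrivals, t)


if __name__ == "__main__":
    rng = random.Random(0)
    # Exp(1) inter-arrival times give a standard homogeneous Poisson process.
    arrivals = renewal_sequence(lambda r: r.expovariate(1.0), 100, rng)
    print(renewal_count(arrivals, 10.0))
```

Replacing the exponential sampler by any other sampler of positive random variables yields a general renewal process.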

Example 2.2.2 (Homogeneous Poisson process)

It follows from Theorem 2.1.6 that a homogeneous Poisson process with
intensity $\lambda$ is a renewal process with iid exponential $\mathrm{Exp}(\lambda)$ inter-arrival times
$W_i$. □

A main motivation for introducing the renewal process is that the (homoge- 
neous) Poisson process does not always describe claim arrivals in an adequate 
way. There can be large gaps between arrivals of claims. For example, it is 
unlikely that windstorm claims arrive according to a homogeneous Poisson 
process. They happen now and then, sometimes with years in between. In 
this case it is more natural to assume that the inter-arrival times have a dis- 
tribution which allows for modeling these large time intervals. The log-normal 
or the Pareto distributions would do this job since their tails are much heavier 
than those of the exponential distribution; see Section 3.2. We have also seen 
in Section 2.1.7 that the Poisson process is not always a realistic model for 
real-life claim arrivals, in particular if one considers long periods of time. 

On the other hand, if we give up the hypothesis of a Poisson process we
lose most of the nice properties of this process which are closely related to the
exponential distribution of the $W_i$'s. For example, it is in general unknown
which distribution $N(t)$ has and what the exact values of $EN(t)$ or $\mathrm{var}(N(t))$
are. We will, however, see that the renewal processes and the homogeneous
Poisson process have various asymptotic properties in common.

The first result of this kind is a strong law of large numbers for the renewal
counting process.

Figure 2.2.3 One path of a renewal process (left graphs) and the corresponding
inter-arrival times (right graphs). Top: Standard homogeneous Poisson process with
iid standard exponential inter-arrival times. Bottom: The renewal process has iid
Pareto distributed inter-arrival times with $P(W_i > x) = x^{-4}$, $x \ge 1$. Both renewal
paths have 100 jumps. Notice the extreme lengths of some inter-arrival times in the
bottom graph; they are atypical for a homogeneous Poisson process.



Theorem 2.2.4 (Strong law of large numbers for the renewal process)

If the expectation $EW_1 = \lambda^{-1}$ of the inter-arrival times $W_i$ is finite, $N$
satisfies the strong law of large numbers:

$$\lim_{t \to \infty} \frac{N(t)}{t} = \lambda \quad \text{a.s.}$$

Proof. We need a simple auxiliary result.

Figure 2.2.5 Five paths of a renewal process with $\lambda = 1$ and $n = 10^i$ jumps,
$i = 2,3,4,5$. The mean value function $EN(t) = t$ is also indicated (solid straight
line). The approximation of $N(t)$ by $EN(t)$ for increasing $t$ is nicely illustrated; on
a large time scale $N(t)$ and $EN(t)$ can hardly be distinguished.

Lemma 2.2.6 Let $(Z_n)$ be a sequence of random variables such that $Z_n \stackrel{\text{a.s.}}{\to} Z$
as $n \to \infty$ for some random variable $Z$, and let $(M(t))_{t \ge 0}$ be a stochastic
process of integer-valued random variables such that $M(t) \stackrel{\text{a.s.}}{\to} \infty$ as $t \to \infty$. If
$M$ and $(Z_n)$ are defined on the same probability space $\Omega$, then

$$Z_{M(t)} \to Z \quad \text{a.s. as } t \to \infty\,.$$

Proof. Write

$$\Omega_1 = \{\omega \in \Omega : M(t,\omega) \to \infty\} \quad \text{and} \quad \Omega_2 = \{\omega \in \Omega : Z_n(\omega) \to Z(\omega)\}\,.$$

By assumption, $P(\Omega_1) = P(\Omega_2) = 1$, hence $P(\Omega_1 \cap \Omega_2) = 1$ and therefore

$$P(\{\omega : Z_{M(t,\omega)}(\omega) \to Z(\omega)\}) \ge P(\Omega_1 \cap \Omega_2) = 1\,.$$

This proves the lemma. □

Recall the following basic relation of a renewal process:

$$\{N(t) = n\} = \{T_n \le t < T_{n+1}\}\,, \quad n \in \mathbb{N}_0\,.$$

Then it is immediate that the following sandwich inequalities hold:

$$\frac{T_{N(t)}}{N(t)} \le \frac{t}{N(t)} \le \frac{T_{N(t)+1}}{N(t)+1}\; \frac{N(t)+1}{N(t)}\,. \tag{2.2.32}$$

By the strong law of large numbers for the iid sequence $(W_n)$ we have

$$n^{-1}\, T_n \stackrel{\text{a.s.}}{\to} \lambda^{-1}\,.$$

In particular, $N(t) \to \infty$ a.s. as $t \to \infty$. Now apply Lemma 2.2.6 with $Z_n =
T_n/n$ and $M = N$ to obtain

$$\frac{T_{N(t)}}{N(t)} \stackrel{\text{a.s.}}{\to} \lambda^{-1}\,. \tag{2.2.33}$$

The statement of the theorem follows by a combination of (2.2.32) and
(2.2.33). □
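The strong law can be observed along a single simulated path. The Python sketch below is an illustration under our own assumptions: it uses Pareto inter-arrival times with $P(W_1 > x) = x^{-4}$, $x \ge 1$, so that $EW_1 = 4/3$ and $\lambda = 0.75$; the function names, checkpoints and seed are hypothetical.

```python
import random


def pareto_interarrival(rng, alpha=4.0):
    """W with P(W > x) = x^(-alpha), x >= 1, so that EW = alpha / (alpha - 1)."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)


def ratio_path(sample_w, checkpoints, seed=0):
    """Return [(t, N(t)/t)] along one renewal path for increasing times t."""
    rng = random.Random(seed)
    out = []
    count = 0                     # number of arrivals T_i <= current checkpoint
    next_arrival = sample_w(rng)  # T_{count + 1}
    for t in checkpoints:
        while next_arrival <= t:
            count += 1
            next_arrival += sample_w(rng)
        out.append((t, count / t))
    return out


if __name__ == "__main__":
    for t, ratio in ratio_path(pareto_interarrival, [1e2, 1e3, 1e4, 1e5], seed=11):
        print(t, ratio)  # the ratios should approach 1 / EW_1 = 0.75
```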

In the case of a homogeneous Poisson process we know the exact value of
the expected renewal process: $EN(t) = \lambda t$. In the case of a general renewal
process $N$ the strong law of large numbers $N(t)/t \stackrel{\text{a.s.}}{\to} \lambda = (EW_1)^{-1}$ suggests
that the expectation $EN(t)$ of the renewal process is approximately of the
order $\lambda t$. A lower bound for $EN(t)/t$ is easily achieved. By an application of
Fatou's lemma (see for example Williams [78]) and the strong law of large
numbers for $N(t)$,

$$\lambda = E \liminf_{t \to \infty} \frac{N(t)}{t} \le \liminf_{t \to \infty} \frac{EN(t)}{t}\,. \tag{2.2.34}$$

This lower bound can be complemented by the corresponding upper one which 
leads to the following standard result. 

Theorem 2.2.7 (Elementary renewal theorem)

If the expectation $EW_1 = \lambda^{-1}$ of the inter-arrival times is finite, the following
relation holds:

$$\lim_{t \to \infty} \frac{EN(t)}{t} = \lambda\,.$$

Proof. By virtue of (2.2.34) it remains to prove that

$$\limsup_{t \to \infty} \frac{EN(t)}{t} \le \lambda\,. \tag{2.2.35}$$

Figure 2.2.8 The ratio $N(t)/t$ for a renewal process with $n = 10^i$ jumps, $i =
2,3,4,5$, and $\lambda = 1$. The strong law of large numbers forces $N(t)/t$ towards 1 for
large $t$.


We use a truncation argument which we borrow from Resnick [65], p. 191.
Write for any $b > 0$,

$$W_i^{(b)} = \min(W_i, b)\,, \quad T_i^{(b)} = W_1^{(b)} + \cdots + W_i^{(b)}\,, \quad i \ge 1\,.$$

Obviously, $(T_n^{(b)})$ is a renewal sequence and $T_n \ge T_n^{(b)}$ which implies $N_b(t) \ge
N(t)$ for the corresponding renewal process

$$N_b(t) = \#\{i \ge 1 : T_i^{(b)} \le t\}\,, \quad t \ge 0\,.$$

Hence

$$\limsup_{t \to \infty} \frac{EN(t)}{t} \le \limsup_{t \to \infty} \frac{EN_b(t)}{t}\,. \tag{2.2.36}$$







Figure 2.2.9 Visualization of the validity of the strong law of large numbers for
the arrivals of the Danish fire insurance data 1980-1990; see Section 2.1.7 for a
description of the data. Top left: The ratio $N(t)/t$ for 1980-1984, where $N(t)$ is
the claim number at day $t$ in this period. The values cluster around the value 0.46
which is indicated by the constant line. Top right: The ratio $N(t)/t$ for 1985-1990,
where $N(t)$ is the claim number at day $t$ in this period. The values cluster around
the value 0.61 which is indicated by the constant line. Bottom: The ratio $N(t)/t$ for
the whole period 1980-1990, where $N(t)$ is the claim number at day $t$ in this period.
The graph gives evidence about the fact that the strong law of large numbers does
not apply to $N$ for the whole period. This is caused by an increase of the annual
intensity in 1985-1990 which can be observed in Figure 2.1.21. This fact makes the
assumption of iid inter-arrival times over the whole period of 11 years questionable.
We do, however, see in the top graphs that the strong law of large numbers works
satisfactorily in the two distinct periods.

We observe that, by definition of $N_b$,



rp{b) 

1 N b (t) 



= Wf } + 



w , 



(b) 



N b (t) 



< t. 



Since $N_b(t) + 1$ is a so-called stopping time (see footnote 19) with respect to
the natural filtration generated by the sequence $(W_i^{(b)})$, the relation

$$E\big(T^{(b)}_{N_b(t)+1}\big) = E(N_b(t) + 1)\, EW_1^{(b)} \qquad (2.2.37)$$

holds by virtue of Wald’s identity. Moreover, $T^{(b)}_{N_b(t)+1} = T^{(b)}_{N_b(t)} + W^{(b)}_{N_b(t)+1} \le t + b$.
Combining (2.2.36)-(2.2.37), we conclude that

$$\limsup_{t\to\infty} \frac{EN(t)}{t} \le \limsup_{t\to\infty} \frac{E\big(T^{(b)}_{N_b(t)+1}\big)}{t\, EW_1^{(b)}} \le \limsup_{t\to\infty} \frac{t + b}{t\, EW_1^{(b)}} = \big(EW_1^{(b)}\big)^{-1}\,.$$

Since by the monotone convergence theorem (see for example Williams [78]),
letting $b \uparrow \infty$,

$$EW_1^{(b)} = E(\min(b, W_1)) \uparrow EW_1 = \lambda^{-1}\,,$$

the desired relation (2.2.35) follows. This concludes the proof. □
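The statement of the elementary renewal theorem can be checked informally by simulation. The following is only a minimal sketch: the helper names `renewal_count` and `mean_count_rate` are hypothetical, and the uniform inter-arrival distribution on (0, 2) is an assumed example with $EW_1 = 1$, so the theorem predicts $EN(t)/t \to 1$.

```python
import random


def renewal_count(t, draw_interarrival):
    """Return N(t) = #{i >= 1 : T_i <= t} for one simulated renewal sequence."""
    count = 0
    arrival = draw_interarrival()          # T_1 = W_1
    while arrival <= t:
        count += 1
        arrival += draw_interarrival()     # T_{i+1} = T_i + W_{i+1}
    return count


def mean_count_rate(t, draw_interarrival, n_paths):
    """Monte Carlo estimate of EN(t)/t based on n_paths independent paths."""
    total = sum(renewal_count(t, draw_interarrival) for _ in range(n_paths))
    return total / (n_paths * t)


rng = random.Random(0)                     # fixed seed, only for reproducibility
draw = lambda: rng.uniform(0.0, 2.0)       # assumed example: EW_1 = 1, i.e. lambda = 1
for t in (10.0, 100.0, 1000.0):
    print(t, mean_count_rate(t, draw, n_paths=200))
```

The printed ratios should approach 1 as $t$ grows; the Monte Carlo error of course depends on the number of simulated paths.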

For further reference we include a result about the asymptotic behavior of 
var (N(t)). The proof can be found in Gut [40], Theorem 5.2. 

Proposition 2.2.10 (The asymptotic behavior of the variance of the renewal
process)

Assume $\mathrm{var}(W_1) < \infty$. Then

$$\lim_{t\to\infty} \frac{\mathrm{var}(N(t))}{t} = \frac{\mathrm{var}(W_1)}{(EW_1)^3}\,.$$

Finally, we mention that $N(t)$ satisfies the central limit theorem; see
Embrechts et al. [29], Theorem 2.5.13, for a proof.

Theorem 2.2.11 (The central limit theorem for the renewal process)
Assume that $\mathrm{var}(W_1) < \infty$. Then the central limit theorem

$$\big(\mathrm{var}(W_1)\, (EW_1)^{-3}\, t\big)^{-1/2}\, (N(t) - \lambda t) \stackrel{d}{\to} Y \sim N(0, 1) \qquad (2.2.38)$$

holds as $t \to \infty$.

19 Let $\mathcal{F}_n = \sigma(W_i^{(b)}, i \le n)$ be the $\sigma$-field generated by $W_1^{(b)}, \ldots, W_n^{(b)}$. Then
$(\mathcal{F}_n)$ is the natural filtration generated by the sequence $(W_n^{(b)})$. An integer-valued
random variable $\tau$ is a stopping time with respect to $(\mathcal{F}_n)$ if $\{\tau = n\} \in \mathcal{F}_n$.
If $E\tau < \infty$ Wald’s identity yields $E\big(\sum_{i=1}^{\tau} W_i^{(b)}\big) = E\tau\, EW_1^{(b)}$. Notice that
$\{N_b(t) = n\} = \{T_n^{(b)} \le t < T_{n+1}^{(b)}\}$. Hence $N_b(t)$ is not a stopping time. However,
the same argument shows that $N_b(t) + 1$ is a stopping time with respect to $(\mathcal{F}_n)$.
The interested reader is referred to Williams’s textbook [78] which gives a concise
introduction to discrete-time martingales, filtrations and stopping times.

By virtue of Proposition 2.2.10, the normalizing constants $\sqrt{\mathrm{var}(W_1)\, (EW_1)^{-3}\, t}$
in (2.2.38) can be replaced by the standard deviation $\sqrt{\mathrm{var}(N(t))}$.
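The normalization in the central limit theorem for the renewal process can be explored numerically. The sketch below is illustrative only: the function names are hypothetical and the uniform inter-arrival times on (0, 2), with $EW_1 = 1$ and $\mathrm{var}(W_1) = 1/3$, are an assumed example. It standardizes simulated values of $N(t)$ with the constants from (2.2.38) and reports the empirical second moment and the fraction of values in $[-1.96, 1.96]$.

```python
import math
import random


def renewal_count(t, draw_interarrival):
    """Return N(t) = #{i >= 1 : T_i <= t} for one simulated renewal sequence."""
    count = 0
    arrival = draw_interarrival()
    while arrival <= t:
        count += 1
        arrival += draw_interarrival()
    return count


def standardized_counts(t, draw_interarrival, mean_w, var_w, n_paths):
    """Simulate (var(W_1) (EW_1)^{-3} t)^{-1/2} (N(t) - lambda t) n_paths times."""
    lam = 1.0 / mean_w
    scale = math.sqrt(var_w * t / mean_w ** 3)
    return [(renewal_count(t, draw_interarrival) - lam * t) / scale
            for _ in range(n_paths)]


rng = random.Random(0)                     # fixed seed, only for reproducibility
z = standardized_counts(300.0, lambda: rng.uniform(0.0, 2.0),
                        mean_w=1.0, var_w=1.0 / 3.0, n_paths=500)
second_moment = sum(v * v for v in z) / len(z)
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)
print(second_moment, coverage)             # compare with 1 and 0.95
```

If the normal approximation is adequate at this value of $t$, the second moment should be close to 1 and the coverage close to 0.95.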



2.2.2 An Informal Discussion of Renewal Theory 

Renewal processes model occurrences of events happening at random instants 
of time, where the inter-arrival times are approximately iid. In the context of 
non-life insurance these instants were interpreted as the arrival times of claims. 
Renewal processes play a major role in applied probability. Complex stochastic 
systems can often be described by one or several renewal processes as building 
blocks. For example, the Internet can be understood as the superposition of 
a huge number of ON/OFF processes. Each of these processes corresponds to 
one “source” (computer) which communicates with other sources. ON refers 
to an active period of the source, OFF to a period of silence. The ON/OFF 
periods of each source constitute two sequences of iid positive random vari- 
ables, both defining renewal processes. 20 A renewal process is also defined by 
the sequence of renewals (times of replacement) of a technical device or tool, 
say the light bulbs in a lamp or the fuel in a nuclear power station. From these 
elementary applications the process gained its name. 

Because of their theoretical importance renewal processes are among the
best studied processes in applied probability theory. The object of main
interest in renewal theory is the renewal function (see footnote 21)

$$m(t) = EN(t) + 1\,, \qquad t \ge 0\,.$$

It describes the average behavior of the renewal counting process. In the in- 
surance context, this is the expected number of claim arrivals in a portfolio. 
This number certainly plays an important role in the insurance business and 
its theoretical understanding is therefore essential. The iid assumption of the 
inter-arrival times is perhaps not the most realistic but is convenient for build- 
ing up a theory. 

The elementary renewal theorem (Theorem 2.2.7) is a simple but not very
precise result about the average behavior of renewals: $m(t) = \lambda t\, (1 + o(1))$ as
$t \to \infty$, provided $EW_1 = \lambda^{-1} < \infty$. Much more precise information is gained
by Blackwell’s renewal theorem. It says that for $h > 0$,

$$m(t, t+h] = EN(t, t+h] \to \lambda h\,, \qquad t \to \infty\,.$$

20 The approach to tele-traffic via superpositions of ON/OFF processes became 
popular in the 1990s; see Willinger et al. [79]. 

21 The addition of one unit to the mean $EN(t)$ refers to the fact that $T_0 = 0$ is often
considered as the first renewal time. This definition often leads to more elegant
theoretical formulations. Alternatively, we have learned on p. 65 that the process
$N(t) + 1$ has the desirable theoretical property of a stopping time, which $N(t)$
does not have.

(For Blackwell’s renewal theorem and the further statements of this section we
assume that the inter-arrival times $W_i$ have a density.) Thus, for sufficiently
large $t$, the expected number of renewals in the interval $(t, t+h]$ becomes
independent of $t$ and is proportional to the length of the interval. Since $m$ is
a non-decreasing function on $[0,\infty)$ it defines a measure $m$ (we use the same
symbol for convenience) on the Borel $\sigma$-field of $[0,\infty)$, the so-called renewal
measure.
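Blackwell’s renewal theorem can be illustrated by estimating the expected number of renewals in a window $(t, t+h]$ for a small and a larger value of $t$. This is a sketch under stated assumptions: the function names are hypothetical, and the inter-arrival times are taken to be gamma distributed with shape 2 and scale 0.5 (so they have a density and $EW_1 = 1$, i.e. $\lambda = 1$).

```python
import random


def count_in_window(t, h, draw_interarrival):
    """Number of renewal times T_i falling in (t, t + h] on one simulated path."""
    arrival = draw_interarrival()
    count = 0
    while arrival <= t + h:
        if arrival > t:
            count += 1
        arrival += draw_interarrival()
    return count


def expected_window_count(t, h, draw_interarrival, n_paths):
    """Monte Carlo estimate of EN(t, t + h]."""
    total = sum(count_in_window(t, h, draw_interarrival) for _ in range(n_paths))
    return total / n_paths


rng = random.Random(0)                          # fixed seed, only for reproducibility
draw = lambda: rng.gammavariate(2.0, 0.5)       # shape 2, scale 0.5: mean 1, has a density
h = 1.0
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, expected_window_count(t, h, draw, n_paths=4000))   # compare with lambda * h = 1
```

For this example the estimate at $t = 0$ is visibly smaller than $\lambda h = 1$, while the estimates for larger $t$ settle near 1, in line with the theorem.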

A special calculus has been developed for integrals with respect to the re- 
newal measure. In this context, the crucial condition on the integrands is called 
direct Riemann integrability. Directly Riemann integrable functions on [0, oo) 
constitute quite a sophisticated class of integrands; it includes Riemann inte- 
grable functions on [0, oo) which have compact support (the function vanishes 
outside a certain finite interval) or which are non-increasing and non-negative. 
The key renewal theorem states that for a directly Riemann integrable
function $f$,

$$\int_0^t f(t - s)\, dm(s) \to \lambda \int_0^\infty f(s)\, ds\,. \qquad (2.2.39)$$



Under general conditions, it is equivalent to Blackwell’s renewal theorem
which, in a sense, is a special case of (2.2.39) for indicator functions $f(x) =
I_{(0,h]}(x)$ with $h > 0$ and for $t > h$:

$$\int_0^t f(t - s)\, dm(s) = \int_{t-h}^t I_{(0,h]}(t - s)\, dm(s) = m(t-h, t] \to \lambda \int_0^\infty f(s)\, ds = \lambda h\,.$$



An important part of renewal theory is devoted to the renewal equation.
It is a convolution equation of the form

$$U(t) = u(t) + \int_0^t U(t - y)\, dF_{T_1}(y)\,, \qquad (2.2.40)$$

where all functions are defined on $[0,\infty)$. The function $U$ is unknown, $u$ is a
known function and $F_{T_1}$ is the distribution function of the iid positive
inter-arrival times $W_i = T_i - T_{i-1}$. The main goal is to find a solution $U$ to (2.2.40).
It is provided by the following general result which can be found in Resnick
[65], p. 202.

Theorem 2.2.12 (W. Smith’s key renewal theorem)

(1) If $u$ is bounded on every finite interval then

$$U(t) = \int_0^t u(t - s)\, dm(s)\,, \qquad t \ge 0\,, \qquad (2.2.41)$$

is the unique solution of the renewal equation (2.2.40) in the class of all
functions on $(0,\infty)$ which are bounded on finite intervals. Here the
right-hand integral has to be interpreted as $\int_{-\infty}^t u(t - s)\, dm(s)$ with the
convention that $m(s) = u(s) = 0$ for $s < 0$.

(2) If, in addition, $u$ is directly Riemann integrable, then

$$\lim_{t\to\infty} U(t) = \lambda \int_0^\infty u(s)\, ds\,.$$

Part (2) of the theorem is immediate from Blackwell’s renewal theorem. 

The renewal function itself satisfies the renewal equation with $u = I_{[0,\infty)}$.
From this fact the general equation (2.2.40) gained its name.

Example 2.2.13 (The renewal function satisfies the renewal equation)

Observe that for $t \ge 0$,

$$\begin{aligned}
m(t) = EN(t) + 1 &= 1 + E\Big(\sum_{n=1}^{\infty} I_{[0,t]}(T_n)\Big) = 1 + \sum_{n=1}^{\infty} P(T_n \le t) \\
&= I_{[0,\infty)}(t) + \sum_{n=1}^{\infty} \int_0^t P(y + (T_n - T_1) \le t)\, dF_{T_1}(y) \\
&= I_{[0,\infty)}(t) + \int_0^t \sum_{n=1}^{\infty} P(T_{n-1} \le t - y)\, dF_{T_1}(y) \\
&= I_{[0,\infty)}(t) + \int_0^t m(t - y)\, dF_{T_1}(y)\,.
\end{aligned}$$

This is a renewal equation with $U(t) = m(t)$ and $u(t) = I_{[0,\infty)}(t)$. □
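A renewal equation of the form (2.2.40) can also be solved approximately on a grid. The sketch below is not taken from the text: `solve_renewal_equation` is a hypothetical helper that replaces the Stieltjes integral by a first-order Riemann-Stieltjes sum, assuming $F_{T_1}(0) = 0$. As a check it is applied to $u \equiv 1$ on $[0,\infty)$ and Exp(1) inter-arrival times, where the exact solution is the renewal function $m(t) = \lambda t + 1 = t + 1$ of the homogeneous Poisson process.

```python
import math


def solve_renewal_equation(u, F, t_max, n_steps):
    """Approximate U(k*h), k = 0..n_steps, h = t_max/n_steps, for
    U(t) = u(t) + int_0^t U(t - y) dF(y), assuming F(0) = 0.

    The integral is replaced by sum_j U(t_k - t_j) * (F(t_j) - F(t_{j-1})),
    a first-order approximation in the step size h.
    """
    h = t_max / n_steps
    # dF[j - 1] is the probability mass of the interval ((j - 1) h, j h]
    dF = [F(j * h) - F((j - 1) * h) for j in range(1, n_steps + 1)]
    U = [u(0.0)]
    for k in range(1, n_steps + 1):
        convolution = sum(U[k - j] * dF[j - 1] for j in range(1, k + 1))
        U.append(u(k * h) + convolution)
    return U


# Assumed example: Exp(1) inter-arrival times, u = 1 on [0, infinity).
U = solve_renewal_equation(lambda t: 1.0, lambda y: 1.0 - math.exp(-y), 5.0, 500)
print(U[-1])   # exact value is m(5) = 6; the first-order scheme lands slightly below
```

A finer grid reduces the discretization error at quadratic cost in the number of grid points; higher-order quadrature rules would be a natural refinement.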

The usefulness of the renewal equation is illustrated in the following example. 

Example 2.2.14 (Recurrence times of a renewal process)

In our presentation we closely follow Section 3.5 in Resnick [65]. Consider a
renewal sequence $(T_n)$ with $T_0 = 0$ and $W_n > 0$ a.s. Recall that

$$\{N(t) = n\} = \{T_n \le t < T_{n+1}\}\,.$$

In particular, $T_{N(t)} \le t < T_{N(t)+1}$. For $t \ge 0$, the quantities

$$F(t) = T_{N(t)+1} - t \quad\text{and}\quad B(t) = t - T_{N(t)}$$

are the forward and backward recurrence times of the renewal process,
respectively. For obvious reasons, $F(t)$ is also called the excess life or residual life,
i.e., it is the time until the next renewal, and $B(t)$ is called the age process. In
an insurance context, $F(t)$ is the time until the next claim arrives, and $B(t)$
is the time which has evolved since the last claim arrived.

It is our aim to show that the function $P(B(t) \le x)$ for fixed $0 \le x < t$
satisfies a renewal equation. It suffices to consider the values $x < t$ since
$B(t) \le t$ a.s., hence $P(B(t) \le x) = 1$ for $x \ge t$. We start with the identity

$$P(B(t) \le x) = P(B(t) \le x\,, T_1 \le t) + P(B(t) \le x\,, T_1 > t)\,, \qquad x > 0\,. \qquad (2.2.42)$$

If $T_1 > t$, no jump has occurred by time $t$, hence $N(t) = 0$ and therefore
$B(t) = t$. We conclude that

$$P(B(t) \le x\,, T_1 > t) = (1 - F_{T_1}(t))\, I_{[0,x]}(t)\,. \qquad (2.2.43)$$

For $T_1 \le t$, we want to show the following result:

$$P(B(t) \le x\,, T_1 \le t) = \int_0^t P(B(t - y) \le x)\, dF_{T_1}(y)\,. \qquad (2.2.44)$$

This means that, on the event $\{T_1 \le t\}$, the process $B$ “starts from scratch”
at $T_1$. We make this precise by exploiting a “typical renewal argument”. First
observe that

$$\begin{aligned}
P(B(t) \le x\,, T_1 \le t) &= P(t - T_{N(t)} \le x\,, N(t) \ge 1) \\
&= \sum_{n=1}^{\infty} P(t - T_{N(t)} \le x\,, N(t) = n) \\
&= \sum_{n=1}^{\infty} P(t - T_n \le x\,, T_n \le t < T_{n+1})\,.
\end{aligned}$$

We study the summands individually by conditioning on $\{T_1 = y\}$ for $y \le t$:

$$\begin{aligned}
P(t - T_n \le x\,, T_n \le t < T_{n+1} \mid T_1 = y)
&= P\Big(t - \Big[y + \sum_{i=2}^{n} W_i\Big] \le x\,,\; y + \sum_{i=2}^{n} W_i \le t < y + \sum_{i=2}^{n+1} W_i\Big) \\
&= P(t - y - T_{n-1} \le x\,, T_{n-1} \le t - y < T_n) \\
&= P(t - y - T_{N(t-y)} \le x\,, N(t - y) = n - 1)\,.
\end{aligned}$$

Hence we have

$$\begin{aligned}
P(B(t) \le x\,, T_1 \le t) &= \sum_{n=0}^{\infty} \int_0^t P(t - y - T_{N(t-y)} \le x\,, N(t - y) = n)\, dF_{T_1}(y) \\
&= \int_0^t P(B(t - y) \le x)\, dF_{T_1}(y)\,,
\end{aligned}$$

which is the desired relation (2.2.44). Combining (2.2.42)-(2.2.44), we arrive
at

$$P(B(t) \le x) = (1 - F_{T_1}(t))\, I_{[0,x]}(t) + \int_0^t P(B(t - y) \le x)\, dF_{T_1}(y)\,. \qquad (2.2.45)$$

This is a renewal equation of the form (2.2.40) with $u(t) = (1 - F_{T_1}(t))\, I_{[0,x]}(t)$,
and $U(t) = P(B(t) \le x)$ is the unknown function.

A similar renewal equation can be given for $P(F(t) > x)$:

$$P(F(t) > x) = \int_0^t P(F(t - y) > x)\, dF_{T_1}(y) + (1 - F_{T_1}(t + x))\,. \qquad (2.2.46)$$

We mentioned before, see (2.2.41), that the unique solution to the renewal
equation (2.2.45) is given by

$$U(t) = P(B(t) \le x) = \int_0^t (1 - F_{T_1}(t - y))\, I_{[0,x]}(t - y)\, dm(y)\,. \qquad (2.2.47)$$

Now consider a homogeneous Poisson process with intensity $\lambda$. In this case,
$m(t) = EN(t) + 1 = \lambda t + 1$, $1 - F_{T_1}(x) = \exp\{-\lambda x\}$. From (2.2.47) for $x < t$
and since $B(t) \le t$ a.s. we obtain

$$P(B(t) \le x) = P(t - T_{N(t)} \le x) = \begin{cases} 1 - e^{-\lambda x} & \text{if } x < t\,, \\ 1 & \text{if } x \ge t\,. \end{cases}$$

A similar argument yields for $F(t)$,

$$P(F(t) \le x) = P(T_{N(t)+1} - t \le x) = 1 - e^{-\lambda x}\,, \qquad x > 0\,.$$

The latter result is counterintuitive in a sense since, on the one hand, the
inter-arrival times $W_i$ are Exp($\lambda$) distributed and, on the other hand, the
time $T_{N(t)+1} - t$ until the next renewal has the same distribution. This reflects
the forgetfulness property of the exponential distribution of the inter-arrival
times. We refer to Example 2.1.7 for further discussions and a derivation of
the distributions of $B(t)$ and $F(t)$ for the homogeneous Poisson process by
elementary means. □
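The distributions of the recurrence times in the Poisson case can be checked by simulation. The following is a minimal sketch with a hypothetical helper `recurrence_times`; the values $t = 3$, $\lambda = 1$ and $x = 1$ are arbitrary illustrative choices with $x < t$.

```python
import math
import random


def recurrence_times(t, lam, rng):
    """Return (B(t), F(t)) for one path of a homogeneous Poisson process
    with intensity lam, using the convention T_0 = 0."""
    last_arrival = 0.0                        # T_{N(t)}, starts at T_0 = 0
    next_arrival = rng.expovariate(lam)       # candidate for T_{N(t)+1}
    while next_arrival <= t:
        last_arrival = next_arrival
        next_arrival += rng.expovariate(lam)
    return t - last_arrival, next_arrival - t


rng = random.Random(0)                        # fixed seed, only for reproducibility
t, lam, x = 3.0, 1.0, 1.0
sample = [recurrence_times(t, lam, rng) for _ in range(5000)]
print(sum(b <= x for b, _ in sample) / len(sample))   # estimate of P(B(t) <= x)
print(sum(f <= x for _, f in sample) / len(sample))   # estimate of P(F(t) <= x)
print(1.0 - math.exp(-lam * x))                       # 1 - exp(-lambda x)
```

Both empirical frequencies should be close to $1 - e^{-\lambda x}$, although $F(t)$ measures only the remaining part of the inter-arrival interval covering $t$.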

Comments 

Renewal theory constitutes an important part of applied probability theory.
Resnick [65] gives an entertaining introduction with various applications,
among others, to problems of insurance mathematics. The advanced text on
stochastic processes in insurance mathematics by Rolski et al. [67] makes
extensive use of renewal techniques. Gut’s book [40] is a collection of various
useful limit results related to renewal theory and stopped random walks.

The notion of direct Riemann integrability has been discussed in various
books; see Alsmeyer [1], p. 69, Asmussen [5], Feller [32], pp. 361-362, or
Resnick [65], Section 3.10.1.

Smith’s key renewal theorem will also be key to the asymptotic results on
the ruin probability in the Cramér-Lundberg model in Section 4.2.2.

Exercises 

(1) Let $(T_i)$ be a renewal sequence with $T_0 = 0$, $T_n = W_1 + \cdots + W_n$, where $(W_i)$
is an iid sequence of non-negative random variables.

(a) Which assumption is needed to ensure that the renewal process $N(t) = \#\{i \ge
1 : T_i \le t\}$ has no jump sizes greater than 1 with positive probability?

(b) Can it happen that $(T_i)$ has a limit point with positive probability? This would
mean that $N(t) = \infty$ at some finite time $t$.

(2) Let $N$ be a homogeneous Poisson process on $[0, \infty)$ with intensity $\lambda > 0$.

(a) Show that $N(t)$ satisfies the central limit theorem as $t \to \infty$, i.e.,

$$\widehat N(t) = \frac{N(t) - \lambda t}{\sqrt{\lambda t}} \stackrel{d}{\to} Y \sim N(0, 1)\,,$$

(i) by using characteristic functions,

(ii) by employing the known central limit theorem for the sequence
$((N(n) - \lambda n)/\sqrt{\lambda n})_{n=1,2,\ldots}$, and then by proving that
$\max_{t \in (n, n+1]} (N(t) - N(n))/\sqrt{n} \stackrel{P}{\to} 0$.

(b) Show that $N$ satisfies the multivariate central limit theorem for any $0 < s_1 <
\cdots < s_n$ as $t \to \infty$:

$$(\sqrt{\lambda t})^{-1}\, \big(N(s_1 t) - s_1 \lambda t\,, \ldots, N(s_n t) - s_n \lambda t\big) \stackrel{d}{\to} Y \sim N(0, \Sigma)\,,$$

where the right-hand distribution is multivariate normal with mean vector zero
and covariance matrix $\Sigma$ whose entries satisfy $\sigma_{ij} = \min(s_i, s_j)$, $i, j = 1, \ldots, n$.

(3) Let $F(t) = T_{N(t)+1} - t$ be the forward recurrence time from Example 2.2.14.

(a) Show that the probability $P(F(t) > x)$, considered as a function of $t$, for $x > 0$
fixed satisfies the renewal equation (2.2.46).

(b) Solve (2.2.46) in the case of iid Exp($\lambda$) inter-arrival times.

2.3 The Mixed Poisson Process 

In Section 2.1.3 we learned that an inhomogeneous Poisson process $N$ with
mean value function $\mu$ can be derived from a standard homogeneous Poisson
process $\widetilde N$ by a deterministic time change. Indeed, the process

$$\widetilde N(\mu(t))\,, \qquad t \ge 0\,,$$

has the same finite-dimensional distributions as $N$ and is càdlàg, hence it is a
possible representation of the process $N$. In what follows, we will use a similar
construction by randomizing the mean value function.

Definition 2.3.1 (Mixed Poisson process)

Let $\widetilde N$ be a standard homogeneous Poisson process and $\mu$ be the mean value
function of a Poisson process on $[0,\infty)$. Let $\theta > 0$ a.s. be a (non-degenerate)
random variable independent of $\widetilde N$. Then the process

$$N(t) = \widetilde N(\theta\, \mu(t))\,, \qquad t \ge 0\,,$$

is said to be a mixed Poisson process with mixing variable $\theta$.





Figure 2.3.2 Left: Ten sample paths of a standard homogeneous Poisson process.
Right: Ten sample paths of a mixed homogeneous Poisson process with $\mu(t) = t$. The
mixing variable $\theta$ is standard exponentially distributed. The processes in the left and
right graphs have the same mean value function $EN(t) = t$.
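Sample paths like those described in the caption of Figure 2.3.2 can be generated directly from Definition 2.3.1. The sketch below covers only the mixed homogeneous case $\mu(t) = t$; the function name `mixed_poisson_arrivals` is hypothetical, and the standard exponential mixing variable is the assumed example from the figure. The idea is to draw the mixing variable once per path, simulate the arrival times of a standard homogeneous Poisson process, and rescale time by the drawn value.

```python
import random


def mixed_poisson_arrivals(t_max, draw_theta, rng):
    """Return (theta, arrival times in [0, t_max]) of N(t) = N~(theta * t),
    where N~ is a standard homogeneous Poisson process."""
    theta = draw_theta()                  # mixing variable, drawn once per path
    arrivals = []
    s = rng.expovariate(1.0)              # first arrival time of the standard process
    while s <= theta * t_max:
        arrivals.append(s / theta)        # time change: N jumps at time s / theta
        s += rng.expovariate(1.0)
    return theta, arrivals


rng = random.Random(0)                    # fixed seed, only for reproducibility
for _ in range(5):
    theta, arrivals = mixed_poisson_arrivals(10.0, lambda: rng.expovariate(1.0), rng)
    print(round(theta, 3), len(arrivals))   # drawn theta and N(10) for this path
```

For a general mean value function one would map the standard arrival times through a (generalized) inverse of $\theta\mu$ instead of dividing by $\theta$; that extension is not shown here.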



Example 2.3.3 (The negative binomial process as mixed Poisson process)
One of the important representatives of mixed Poisson processes is obtained
by choosing $\mu(t) = t$ and $\theta$ gamma distributed. First recall that a $\Gamma(\gamma, \beta)$
distributed random variable $\theta$ has density

$$f_\theta(x) = \frac{\beta^\gamma}{\Gamma(\gamma)}\, x^{\gamma - 1}\, e^{-\beta x}\,, \qquad x > 0\,. \qquad (2.3.48)$$

Also recall that an integer-valued random variable $Z$ is said to be negative
binomially distributed with parameter $(p, v)$ if it has individual probabilities

$$P(Z = k) = \binom{v + k - 1}{k}\, p^v\, (1 - p)^k\,, \qquad k \in \mathbb{N}_0\,, \quad p \in (0, 1)\,, \quad v > 0\,.$$

Verify that $N(t)$ is negative binomial with parameter $(p, v) = (\beta/(t + \beta), \gamma)$. □
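The claim of Example 2.3.3 can be checked numerically before proving it. The sketch below is illustrative only: the function names, the parameter values (shape 2.5, rate 1.5, $t = 2$), the truncation of the integral at 40 and the midpoint rule are all assumptions made for this example. It integrates the Poisson probabilities against the gamma density and compares the result with the negative binomial probabilities.

```python
import math


def negbin_pmf(k, p, v):
    """P(Z = k) = binom(v + k - 1, k) p^v (1 - p)^k for real v > 0."""
    log_binom = math.lgamma(v + k) - math.lgamma(v) - math.lgamma(k + 1)
    return math.exp(log_binom + v * math.log(p) + k * math.log(1.0 - p))


def mixed_poisson_pmf(k, t, shape, rate, upper=40.0, n=20000):
    """Midpoint-rule approximation of
    int_0^upper P(Poisson(x t) = k) f_theta(x) dx for a Gamma(shape, rate) density."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        log_poisson = -x * t + k * math.log(x * t) - math.lgamma(k + 1)
        log_density = (shape * math.log(rate) - math.lgamma(shape)
                       + (shape - 1.0) * math.log(x) - rate * x)
        total += math.exp(log_poisson + log_density)
    return total * h


t, shape, rate = 2.0, 2.5, 1.5            # assumed illustrative parameter values
for k in range(4):
    print(k, mixed_poisson_pmf(k, t, shape, rate),
          negbin_pmf(k, rate / (t + rate), shape))
```

The two columns should agree up to the quadrature error, which is consistent with (but of course does not replace) the analytic verification asked for in the example.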
In an insurance context, a mixed Poisson process is introduced as a claim
number process if one does not believe in one particular Poisson process as
claim arrival generating process. As a matter of fact, if we observed only one
sample path $\widetilde N(\theta(\omega)\mu(t), \omega)$ of a mixed Poisson process, we would not be able
to distinguish between this kind of process and a Poisson process with mean
value function $\theta(\omega)\mu$. However, if we had several such sample paths we should
see differences in the variation of the paths; see Figure 2.3.2 for an illustration
of this phenomenon.

A mixed Poisson process is a special case of a Cox process. In a Cox process
the mean value function $\mu$ is a general random process with non-decreasing
sample paths, independent of the underlying homogeneous Poisson process $\widetilde N$.
Such processes have proved useful, for example, in medical statistics where
every sample path represents the medical history of a particular patient who
has his/her “own” mean value function. We can think of such a function as
“drawn” from a distribution of mean value functions. Similarly, we can think
of $\theta$ representing different factors of influence on an insurance portfolio. For
example, think of the claim number process of a portfolio of car insurance
policies as a collection of individual sample paths corresponding to the
different insured persons. The variable $\theta(\omega)$ then represents properties such
as the driving skill, the age, the driving experience, the health state, etc., of
the individual drivers.

In Figure 2.3.2 we see one striking difference between a mixed Poisson
process and a homogeneous Poisson process: the shape and magnitude of the
sample paths of the mixed Poisson process vary significantly. This property
cannot be explained by the mean value function

$$EN(t) = E\widetilde N(\theta\mu(t)) = E\big(E[\widetilde N(\theta\mu(t)) \mid \theta]\big) = E[\theta\mu(t)] = E\theta\, \mu(t)\,, \qquad t \ge 0\,.$$

Thus, if $E\theta = 1$, as in Figure 2.3.2, the mean values of the random variables
$\widetilde N(\mu(t))$ and $N(t)$ are the same. The differences between a mixed Poisson
and a Poisson process with the same mean value function can be seen in the
variances. First observe that the Poisson property implies

$$E(N(t) \mid \theta) = \theta\, \mu(t) \quad\text{and}\quad \mathrm{var}(N(t) \mid \theta) = \theta\, \mu(t)\,. \qquad (2.3.49)$$

Next we give an auxiliary result. Its proof is left as an exercise.

Lemma 2.3.4 Let $A$ and $B$ be random variables such that $\mathrm{var}(A) < \infty$. Then

$$\mathrm{var}(A) = E[\mathrm{var}(A \mid B)] + \mathrm{var}(E[A \mid B])\,.$$

An application of this formula with $A = N(t) = \widetilde N(\theta\mu(t))$ and $B = \theta$ together
with (2.3.49) yields

$$\begin{aligned}
\mathrm{var}(N(t)) &= E[\mathrm{var}(N(t) \mid \theta)] + \mathrm{var}(E[N(t) \mid \theta]) \\
&= E[\theta\, \mu(t)] + \mathrm{var}(\theta\, \mu(t)) \\
&= E\theta\, \mu(t) + \mathrm{var}(\theta)\, (\mu(t))^2 \\
&= EN(t)\, \Big(1 + \frac{\mathrm{var}(\theta)}{E\theta}\, \mu(t)\Big) \\
&> EN(t)\,,
\end{aligned}$$

where we assumed that $\mathrm{var}(\theta) < \infty$ and $\mu(t) > 0$. The property

$$\mathrm{var}(N(t)) > EN(t) \quad \text{for any } t > 0 \text{ with } \mu(t) > 0 \qquad (2.3.50)$$

is called over-dispersion. It is one of the major differences between a mixed
Poisson process and a Poisson process $N$, where $EN(t) = \mathrm{var}(N(t))$.
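As a worked instance of the variance formula above, consider the setting described for Figure 2.3.2, taken here as an illustrative assumption: $\mu(t) = t$ and a standard exponential mixing variable, so that $E\theta = \mathrm{var}(\theta) = 1$. Then

```latex
\begin{align*}
EN(t) &= E\theta\,\mu(t) = t\,,\\
\mathrm{var}(N(t)) &= E\theta\,\mu(t) + \mathrm{var}(\theta)\,(\mu(t))^2 = t + t^2\,,\\
\frac{\mathrm{var}(N(t))}{EN(t)} &= 1 + t\,.
\end{align*}
```

For example, at $t = 10$ the mixed process has mean 10 and variance 110, whereas a standard homogeneous Poisson process has mean and variance 10 at the same time point. The ratio $1 + t$ shows that in this example the over-dispersion becomes more pronounced as $t$ grows.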

We conclude by summarizing some of the important properties of the 
mixed Poisson process; some of the proofs are left as exercises. 

The mixed Poisson process inherits the following properties of the Poisson 
process: 

• It has the Markov property, see Section 2.1.2 for some explanation. 

• It has the order statistics property: if the function $\mu$ has a continuous a.e.
positive intensity function $\lambda$ and $N$ has arrival times $0 < T_1 < T_2 < \cdots$,
then for every $t > 0$,

$$(T_1, \ldots, T_n \mid N(t) = n) \stackrel{d}{=} (X_{(1)}, \ldots, X_{(n)})\,,$$

where the right-hand side is the ordered sample of the iid random variables
$X_1, \ldots, X_n$ with common density $\lambda(x)/\mu(t)$, $0 \le x \le t$; cf. Theorem 2.1.11.

The order statistics property is remarkable insofar that it does not depend
on the mixing variable $\theta$. In particular, for a mixed homogeneous Poisson
process the conditional distribution of $(T_1, \ldots, T_{N(t)})$ given $\{N(t) = n\}$ is the
distribution of the ordered sample of iid $U(0, t)$ distributed random variables.

The mixed Poisson process loses some of the properties of the Poisson
process:

• It has dependent increments.

• In general, the distribution of $N(t)$ is not Poisson.

• It is over-dispersed; see (2.3.50).

Comments 

For an extensive treatment of mixed Poisson processes and their properties 
we refer to the monograph by Grandell [37]. It can be shown that the mixed 
Poisson process and the Poisson process are the only point processes on [0, oo) 
which have the order statistics property; see Kallenberg [47]; cf. Grandell [37], 
Theorem 6.6.

Exercises

(1) Consider the mixed Poisson process $(N(t))_{t \ge 0} = (\widetilde N(\theta t))_{t \ge 0}$ with arrival times
$T_i$, where $\widetilde N$ is a standard homogeneous Poisson process on $[0, \infty)$ and $\theta > 0$ is
a non-degenerate mixing variable with $\mathrm{var}(\theta) < \infty$, independent of $\widetilde N$.

(a) Show that $N$ does not have independent increments. (An easy way of doing this
would be to calculate the covariance of $N(s, t]$ and $N(x, y]$ for disjoint intervals
$(s, t]$ and $(x, y]$.)

(b) Show that $N$ has the order statistics property, i.e., given $N(t) = n$, $(T_1, \ldots, T_n)$
has the same distribution as the ordered sample of the iid $U(0, t)$ distributed
random variables $U_1, \ldots, U_n$.

(c) Calculate $P(N(t) = n)$ for $n \in \mathbb{N}_0$. Show that $N(t)$ is not Poisson distributed.

(d) The negative binomial distribution on $\{0, 1, 2, \ldots\}$ has the individual
probabilities

$$p_k = \binom{v + k - 1}{k}\, p^v\, (1 - p)^k\,, \qquad k \in \mathbb{N}_0\,, \quad p \in (0, 1)\,, \quad v > 0\,.$$

Consider the mixed Poisson process $N$ with gamma distributed mixing variable,
i.e., $\theta$ has $\Gamma(\gamma, \beta)$ density

$$f_\theta(x) = \frac{\beta^\gamma}{\Gamma(\gamma)}\, x^{\gamma - 1}\, e^{-\beta x}\,, \qquad x > 0\,.$$

Calculate the probabilities $P(N(t) = k)$ and give some reason why the process
$N$ is called negative binomial process.

(2) Give an algorithm for simulating the sample paths of an arbitrary mixed Poisson
process.

(3) Prove Lemma 2.3.4.

(4) Let $N(t) = \widetilde N(\theta t)$, $t \ge 0$, be mixed Poisson, where $\widetilde N$ is a standard homogeneous
Poisson process, independent of the mixing variable $\theta$.

(a) Show that $N$ satisfies the strong law of large numbers with random limit $\theta$:

$$\frac{N(t)}{t} \stackrel{\rm a.s.}{\to} \theta\,.$$

(b) Show the following “central limit theorem”:

$$\frac{N(t) - \theta t}{\sqrt{\theta t}} \stackrel{d}{\to} Y \sim N(0, 1)\,.$$

(c) Show that the “naive” central limit theorem does not hold by showing that

$$\frac{N(t) - EN(t)}{\sqrt{\mathrm{var}(N(t))}} \stackrel{\rm a.s.}{\to} \frac{\theta - E\theta}{\sqrt{\mathrm{var}(\theta)}}\,.$$

Here we assume that $\mathrm{var}(\theta) < \infty$.

(5) Let $N(t) = \widetilde N(\theta t)$, $t \ge 0$, be mixed Poisson, where $\widetilde N$ is a standard homogeneous
Poisson process, independent of the mixing variable $\theta > 0$. Write $F_\theta$ for the
distribution function of $\theta$ and $\overline F_\theta = 1 - F_\theta$ for its right tail. Show that the
following relations hold for integer $n \ge 1$,

$$P(N(t) > n) = t \int_0^\infty \frac{(tx)^n}{n!}\, e^{-tx}\, \overline F_\theta(x)\, dx\,,$$

$$P(\theta \le x \mid N(t) = n) = \frac{\int_0^x y^n\, e^{-yt}\, dF_\theta(y)}{\int_0^\infty y^n\, e^{-yt}\, dF_\theta(y)}\,,$$

$$E(\theta \mid N(t) = n) = \frac{\int_0^\infty y^{n+1}\, e^{-yt}\, dF_\theta(y)}{\int_0^\infty y^n\, e^{-yt}\, dF_\theta(y)}\,.$$



The Total Claim Amount



In Chapter 2 we learned about three of the most prominent claim number
processes, $N$: the Poisson process in Section 2.1, the renewal process in
Section 2.2, and the mixed Poisson process in Section 2.3. In this section we take
a closer look at the total claim amount process, as introduced on p. 8:

$$S(t) = \sum_{i=1}^{N(t)} X_i\,, \qquad t \ge 0\,, \qquad (3.0.1)$$

where the claim number process $N$ is independent of the iid claim size sequence
$(X_i)$. We also assume that $X_i > 0$ a.s. Depending on the choice of the process
$N$, we get different models for the process $S$. In Example 2.1.3 we introduced
the Cramér-Lundberg model as that particular case of model (3.0.1) when $N$
is a homogeneous Poisson process. Another prominent model for $S$ is called
renewal or Sparre-Anderson model; it is model (3.0.1) when $N$ is a renewal
process.

In Section 3.1 we study the order of magnitude of the total claim amount
$S(t)$ in the renewal model. This means we calculate the mean and the variance
of $S(t)$ for large $t$, which give us a rough impression of the growth of $S(t)$ as
$t \to \infty$. We also indicate that $S$ satisfies the strong law of large numbers and
the central limit theorem. The information about the asymptotic growth of
the total claim amount enables one to give advice as to how much premium
should be charged in a given time period in order to avoid bankruptcy or
ruin in the portfolio. In Section 3.1.3 we collect some of the classical premium
calculation principles which can be used as a rule of thumb for determining
how big the premium income in a homogeneous portfolio should be.

We continue in Section 3.2 by considering some realistic claim size distri- 
butions and their properties. We consider exploratory statistical tools (QQ- 
plots, mean excess function) and apply them to real-life claim size data in 
order to get a preliminary understanding of which distributions fit real-life 
data. In this context, the issue of modeling large claims deserves particular
attention. We discuss the notions of heavy- and light-tailed claim size
distributions as appropriate for modeling large and small claims, respectively. Then,
in Sections 3.2.5 and 3.2.6 we focus on the subexponential distributions and
on distributions with regularly varying tails. The latter classes contain those
distributions which are most appropriate for modeling large claims.

In Section 3.3 we finally study the distribution of the total claim amount
$S(t)$ as a combination of claim number process and claim sizes. We start
in Section 3.3.1 by investigating some theoretical properties of the total
claim amount models. By applying characteristic function techniques, we learn
about mixture distributions as useful tools in the context of compound Poisson
and compound geometric processes. We show that the summation of
independent compound Poisson processes yields a compound Poisson process and we
investigate consequences of this result. In particular, we show in the framework
of the Cramér-Lundberg model that the total claim amounts from disjoint
layers for the claim sizes or over disjoint periods of time are independent
compound Poisson variables. We continue in Section 3.3.3 with a numerical
recursive procedure for determining the distribution of the total claim amount. In
the insurance world, this technique is called Panjer recursion. In Sections 3.3.4
and 3.3.5 we consider alternative methods for determining approximations to
the distribution of the total claim amount. These approximations are based
on the central limit theorem or Monte Carlo techniques.

Finally, in Section 3.4 we apply the developed theory to the case of reinsur- 
ance treaties. The latter are agreements between a primary and a secondary 
insurer with the aim to protect the primary insurer against excessive losses 
which are caused by very large claim sizes or by a large number of small and 
moderate claim sizes. We discuss the most important forms of the treaties and 
indicate how previously developed theory can be applied to deal with their 
distributional properties. 



3.1 The Order of Magnitude of the Total Claim Amount 

Given a particular model for $S$, one of the important questions for an insurance
company is to determine the order of magnitude of $S(t)$. This information is
needed in order to determine a premium which covers the losses represented
by $S(t)$.

Most desirably, one would like to know the distribution of $S(t)$. This,
however, is in general too complicated a problem and therefore one often relies
on numerical or simulation methods in order to approximate the distribution
of $S(t)$. In this section we consider some simple means in order to get a
rough impression of the size of the total claim amount. Those means include
the expectation and variance of $S(t)$ (Section 3.1.1), the strong law of large
numbers, and the central limit theorem for $S(t)$ as $t \to \infty$ (Section 3.1.2). In
Section 3.1.3 we study the relationship of these results with premium
calculation principles.

3.1.1 The Mean and the Variance in the Renewal Model

The expectation of a random variable tells one about its average size. For
the total claim amount the expectation is easily calculated by exploiting the
independence of $(X_i)$ and $N(t)$, provided $EN(t)$ and $EX_1$ are finite:

$$ES(t) = E\Big[E\Big(\sum_{i=1}^{N(t)} X_i \,\Big|\, N(t)\Big)\Big] = E(N(t)\, EX_1) = EN(t)\, EX_1\,.$$



Example 3.1.1 (Expectation of $S(t)$ in the Cramér-Lundberg and renewal
models)

In the Cramér-Lundberg model, $EN(t) = \lambda t$, where $\lambda$ is the intensity of the
homogeneous Poisson process $N$. Hence

$$ES(t) = \lambda t\, EX_1\,.$$

Such a compact formula does not exist in the general renewal model. However,
given $EW_1 = \lambda^{-1} < \infty$ we know from the elementary renewal Theorem 2.2.7
that $EN(t)/t \to \lambda$ as $t \to \infty$. Therefore

$$ES(t) = \lambda t\, EX_1\, (1 + o(1))\,, \qquad t \to \infty\,.$$

This is less precise information than in the Cramér-Lundberg model. However,
this formula tells us that the expected total claim amount grows roughly
linearly for large $t$. As in the Cramér-Lundberg case, the slope of the linear
function is determined by the reciprocal of the expected inter-arrival time
$EW_1$ and the expected claim size $EX_1$. □

The expectation does not tell one too much about the distribution of $S(t)$. We
learn more about the order of magnitude of $S(t)$ if we combine the information
about $ES(t)$ with the variance $\mathrm{var}(S(t))$.

Assume that $\mathrm{var}(N(t))$ and $\mathrm{var}(X_1)$ are finite. Conditioning on $N(t)$ and
exploiting the independence of $N(t)$ and $(X_i)$, we obtain

$$\begin{aligned}
\mathrm{var}\Big(\sum_{i=1}^{N(t)} (X_i - EX_1) \,\Big|\, N(t)\Big) &= \sum_{i=1}^{N(t)} \mathrm{var}(X_i \mid N(t)) = N(t)\, \mathrm{var}(X_1 \mid N(t)) = N(t)\, \mathrm{var}(X_1)\,, \\
E\Big[\sum_{i=1}^{N(t)} X_i \,\Big|\, N(t)\Big] &= N(t)\, EX_1\,.
\end{aligned}$$

By virtue of Lemma 2.3.4 we conclude that

$$\mathrm{var}(S(t)) = E[N(t)\, \mathrm{var}(X_1)] + \mathrm{var}(N(t)\, EX_1) = EN(t)\, \mathrm{var}(X_1) + \mathrm{var}(N(t))\, (EX_1)^2\,.$$

Example 3.1.2 (Variance of $S(t)$ in the Cramér-Lundberg and renewal models)

In the Cramér-Lundberg model the Poisson distribution of $N(t)$ gives us
$EN(t) = \mathrm{var}(N(t)) = \lambda t$. Hence

$$\mathrm{var}(S(t)) = \lambda t\, [\mathrm{var}(X_1) + (EX_1)^2] = \lambda t\, E(X_1^2)\,.$$

In the renewal model we again depend on some asymptotic formulae for $EN(t)$
and $\mathrm{var}(N(t))$; see Theorem 2.2.7 and Proposition 2.2.10:

$$\begin{aligned}
\mathrm{var}(S(t)) &= [\lambda t\, \mathrm{var}(X_1) + \mathrm{var}(W_1)\, \lambda^3\, t\, (EX_1)^2]\, (1 + o(1)) \\
&= \lambda t\, [\mathrm{var}(X_1) + \mathrm{var}(W_1)\, \lambda^2\, (EX_1)^2]\, (1 + o(1))\,.
\end{aligned}$$

□



We summarize our findings. 

Proposition 3.1.3 (Expectation and variance of the total claim amount in 
the renewal model) 

In the renewal model, if EW\ = A” 1 and EX i are finite, 



lim 

t—> OO 



ES(t) 

t 



XEXi , 



and z/var(W / i) and var(Xi) are finite, 



lim var (‘-’ W) _ ^ [var(Jfi) + var(IVi) A 2 (EX i) 2 l . 

t— >oo t L J 

In the Cramer-Lundberg model these limit relations degenerate to identities 
for every t > 0: 



ES(t ) = XtEXi and var (S(t)) = XtE(X^) . 



The message of these results is that in the renewal model both the expectation 
and the variance of the total claim amount grow roughly linearly as a function 
of t. This is important information which can be used to give a rule of thumb 
about how much premium has to be charged for covering the losses S(t): the 
premium should increase roughly linearly and with a slope larger than $\lambda\,EX_1$.
In Section 3.1.3 we will consider some of the classical premium calculation 
principles and there we will see that this rule of thumb is indeed quite valuable. 
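
The identities for the Cramer-Lundberg model can be checked by simulation. The following Python sketch is an illustration only and not part of the original text: the function names, the parameter values (intensity 2, horizon t = 10, standard exponential claim sizes) and the number of simulated paths are assumptions chosen for the example. It draws independent copies of S(t) and compares their sample mean and variance with $\lambda t\,EX_1$ and $\lambda t\,E(X_1^2)$.

```python
import random
import statistics


def simulate_total_claim(t, lam, draw_claim, rng):
    """Return one draw of S(t) = X_1 + ... + X_{N(t)}.

    Arrival times are sums of iid Exp(lam) inter-arrival times, so the
    claim number N(t) is Poisson with mean lam * t (Cramer-Lundberg model).
    """
    total = 0.0
    arrival = rng.expovariate(lam)
    while arrival <= t:
        total += draw_claim(rng)
        arrival += rng.expovariate(lam)
    return total


def estimate_mean_and_variance(t, lam, draw_claim, n_paths=20000, seed=1):
    """Monte Carlo estimates of ES(t) and var(S(t)) from n_paths draws."""
    rng = random.Random(seed)
    draws = [simulate_total_claim(t, lam, draw_claim, rng) for _ in range(n_paths)]
    return statistics.fmean(draws), statistics.variance(draws)


if __name__ == "__main__":
    lam, t = 2.0, 10.0
    # Standard exponential claims: EX_1 = 1 and E(X_1^2) = 2.
    mean_hat, var_hat = estimate_mean_and_variance(t, lam, lambda r: r.expovariate(1.0))
    print("ES(t):     estimate", round(mean_hat, 2), "theory", lam * t * 1.0)
    print("var(S(t)): estimate", round(var_hat, 2), "theory", lam * t * 2.0)
```

The estimates fluctuate with the seed and the number of paths; only their closeness to the theoretical values is meaningful.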



3.1.2 The Asymptotic Behavior in the Renewal Model 

In this section we are interested in the asymptotic behavior of the total claim 
amount process. Throughout we assume the renewal model (see p. 77) for 
the total claim amount process S. As a matter of fact, S(t ) satisfies quite a 
general strong law of large numbers and central limit theorem: 
Figure 3.1.4 Visualization of the strong law of large numbers for the total claim
amount S in the Cramer-Lundberg model with unit Poisson intensity. Five sample
paths of the process (S(t)/t) are drawn in the interval [0, 1000]. Left: Standard
exponential claim sizes. Right: Pareto distributed claim sizes
$X_i = 1 + (Y_i - EY_i)/\sqrt{\operatorname{var}(Y_i)}$ for iid $Y_i$'s with distribution function
$P(Y_i \le x) = 1 - 2^4 x^{-4}$, $x \ge 2$. These random variables have mean and
variance 1. The fluctuations of S(t)/t around the mean 1 for small t are more
pronounced than for exponential claim sizes. The right tail of the distribution
of $X_1$ is much heavier than the right tail of the exponential distribution.
Therefore much larger claim sizes may occur.



Theorem 3.1.5 (The strong law of large numbers and the central limit the- 
orem in the renewal model) 

Assume the renewal model for S. 

(1) If the inter-arrival times $W_i$ and the claim sizes $X_i$ have finite expectation,
S satisfies the strong law of large numbers:

\[ \lim_{t\to\infty} \frac{S(t)}{t} = \lambda\,EX_1 \quad \text{a.s.} \tag{3.1.2} \]

(2) If the inter-arrival times $W_i$ and the claim sizes $X_i$ have finite variance,
S satisfies the central limit theorem:

\[ \sup_{x\in\mathbb{R}} \Big| P\Big( \frac{S(t) - ES(t)}{\sqrt{\operatorname{var}(S(t))}} \le x \Big) - \Phi(x) \Big| \to 0 \,, \tag{3.1.3} \]

where $\Phi$ is the distribution function of the standard normal N(0, 1) distribution.



Notice that the random sum process S satisfies essentially the same invariance 
principles, strong law of large numbers and central limit theorem, as the partial 
sum process

\[ S_n = X_1 + \cdots + X_n \,, \quad n \ge 1 \,. \]

Indeed, we know from a course in probability theory that $(S_n)$ satisfies the
strong law of large numbers

\[ \lim_{n\to\infty} \frac{S_n}{n} = EX_1 \quad \text{a.s.,} \tag{3.1.4} \]

provided $EX_1 < \infty$, and the central limit theorem

\[ P\Big( \frac{S_n - ES_n}{\sqrt{\operatorname{var}(S_n)}} \le x \Big) \to \Phi(x) \,, \quad x \in \mathbb{R} \,, \]

provided $\operatorname{var}(X_1) < \infty$.

In both relations (3.1.2) and (3.1.3) we could use the asymptotic expressions
for ES(t) and var(S(t)) suggested in Proposition 3.1.3 for normalizing
and centering purposes. Indeed, we have

\[ \lim_{t\to\infty} \frac{S(t)}{ES(t)} = 1 \quad \text{a.s.} \]

and it can be shown by using some more sophisticated asymptotics for ES(t)
that as $t \to \infty$,

\[ \sup_{x\in\mathbb{R}} \Big| P\Big( \frac{S(t) - \lambda\,EX_1\,t}{\sqrt{\lambda t\,[\operatorname{var}(X_1) + \operatorname{var}(W_1)\,\lambda^2\,(EX_1)^2]}} \le x \Big) - \Phi(x) \Big| \to 0 \,. \]



We also mention that the uniform version (3.1.3) of the central limit theorem
is equivalent to the pointwise central limit theorem

\[ P\Big( \frac{S(t) - ES(t)}{\sqrt{\operatorname{var}(S(t))}} \le x \Big) \to \Phi(x) \,, \quad x \in \mathbb{R} \,. \]



This is a consequence of the well-known fact that convergence in distribution 
with continuous limit distribution function implies uniformity of this conver- 
gence; see Billingsley [13]. 

Proof. We only prove the first part of the theorem. For the second part, we 
refer to Embrechts et al. [29], Theorem 2.5.16. We have 



\[ \frac{S(t)}{t} = \frac{S(t)}{N(t)}\,\frac{N(t)}{t} \,. \tag{3.1.5} \]

Write

\[ \Omega_1 = \{\omega : N(t)/t \to \lambda\} \quad\text{and}\quad \Omega_2 = \{\omega : S(t)/N(t) \to EX_1\} \,. \]

By virtue of (3.1.5) the result follows if we can show that $P(\Omega_1 \cap \Omega_2) = 1$.
However, we know from the strong law of large numbers for N (Theorem 2.2.4)
that $P(\Omega_1) = 1$. Moreover, since $N(t) \to \infty$ a.s., an application of the strong
law of large numbers (3.1.4) and Lemma 2.2.6 imply that $P(\Omega_2) = 1$. This
concludes the proof. □

Figure 3.1.6 Top: Visualization of the strong law of large numbers for the Danish
fire insurance data (left) and the US industrial fire data (right). For a description
of these data sets, see Example 3.2.11. The curves show the averaged sample sizes
$S_n/n = (X_1 + \cdots + X_n)/n$ as a function of n; the solid straight line represents
the overall sample mean. Both claim size samples contain very large values. This
fact makes the ratio $S_n/n$ converge to $EX_1$ very slowly. Bottom: The quantities
$(S(t) - ES(t))/\sqrt{\operatorname{var}(S(t))}$ for the Danish fire insurance data. The values of ES(t)
and var(S(t)) were evaluated from the asymptotic expressions suggested by
Proposition 3.1.3. From bottom to top, the constant lines correspond to the 1%-, 2.5%-,
10%-, 50%-, 90%-, 97.5%-, 99%-quantiles of the standard normal distribution.

The strong law of large numbers for the total claim amount process S is one
of the important results which any insurance business has experienced since
the foundation of insurance companies. As a matter of fact, the strong law of 
large numbers can be observed in real-life data; see Figure 3.1.6. Its validity 
gives one confidence that large and small claims averaged over time converge 
to their theoretical mean value. The strong law of large numbers and the 
central limit theorem for S are backbone results when it comes to premium 
calculation. This is the content of the next section. 
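
The strong law (3.1.2) can be watched along a single simulated path. The sketch below is an illustration under assumptions not taken from the text: inter-arrival times uniform on (0, 1), so that $\lambda = 1/EW_1 = 2$, exponential claim sizes with mean 3, and the function name `claim_average_path`. It records S(t)/t at a few increasing times t; the values should settle near $\lambda\,EX_1 = 6$.

```python
import random


def claim_average_path(checkpoints, draw_interarrival, draw_claim, seed=0):
    """Return [(t, S(t)/t)] along one sample path of a renewal model.

    draw_interarrival(rng) and draw_claim(rng) produce one inter-arrival
    time and one claim size; checkpoints are the positive times t to report.
    """
    rng = random.Random(seed)
    total = 0.0                      # running value of S(t)
    next_arrival = draw_interarrival(rng)
    path = []
    for t in sorted(checkpoints):
        # Add all claims that arrive up to time t.
        while next_arrival <= t:
            total += draw_claim(rng)
            next_arrival += draw_interarrival(rng)
        path.append((t, total / t))
    return path


if __name__ == "__main__":
    path = claim_average_path(
        [10, 100, 1000, 10000, 50000],
        draw_interarrival=lambda r: r.uniform(0.0, 1.0),  # EW_1 = 0.5
        draw_claim=lambda r: r.expovariate(1.0 / 3.0),    # EX_1 = 3
    )
    for t, ratio in path:
        print(t, round(ratio, 3))
```

The early values can be far from 6; only the behavior for large t reflects the limit relation.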

3.1.3 Classical Premium Calculation Principles 

One of the basic questions of an insurance business is how one chooses a 
premium in order to cover the losses over time, described by the total claim 
amount process S. We think of the premium income p(t) in the portfolio of
those policies where the claims occur as a deterministic function. 

A coarse, but useful approximation to the random quantity S(t) is given by 
its expectation ES(t). Based on the results of Sections 3.1.1 and 3.1.2 for the 
renewal model, we would expect that the insurance company loses on average 
if p(t) < ES(t) for large t and gains if p(t) > ES(t) for large t. Therefore 
it makes sense to choose a premium by “loading” the expected total claim 
amount by a certain positive number $\rho$.

For example, we know from Proposition 3.1.3 that in the renewal model

\[ ES(t) = \lambda\,EX_1\,t\,(1 + o(1)) \,, \quad t \to \infty \,. \]

Therefore it is reasonable to choose p(t) according to the equation

\[ p(t) = (1 + \rho)\,ES(t) \quad\text{or}\quad p(t) = (1 + \rho)\,\lambda\,EX_1\,t \,, \tag{3.1.6} \]

for some positive number $\rho$, called the safety loading. From the asymptotic
results in Sections 3.1.1 and 3.1.2 it is evident that the larger $\rho$ is, the more
the insurance business is on the safe side. On the other hand, an overly large value
of $\rho$ would make the insurance business less competitive: the number of contracts
would decrease if the premium were too high compared to other premiums 
offered in the market. Since the success of the insurance business is based on 
the strong law of large numbers, one needs large numbers of policies in order 
to ensure the balance of premium income and total claim amount. Therefore, 
premium calculation principles more sophisticated than those suggested by 
(3.1.6) have also been considered in the literature. We briefly discuss some of 
them. 

• The net or equivalence principle. This principle determines the premium 
p(t) at time t as the expectation of the total claim amount S(t): 

\[ p_{\rm Net}(t) = ES(t) \,. \]

In a sense, this is the “fair market premium” to be charged: the insurance
portfolio does not lose or gain capital on average. However, the central limit
theorem (Theorem 3.1.5) in the renewal model tells us that the deviation
of S(t) from its mean increases at an order comparable to its standard
deviation $\sqrt{\operatorname{var}(S(t))}$ as $t \to \infty$. Moreover, these deviations can be
positive or negative with positive probability. Therefore it would be utterly
unwise to charge a premium according to this calculation principle. It is of 
purely theoretical value, a “benchmark premium”. In Section 4.1 we will 
see that the net principle leads to “ruin” of the insurance business. 

• The expected value principle. 

\[ p_{\rm EV}(t) = (1 + \rho)\,ES(t) \,, \]

for some positive safety loading $\rho$. The rationale of this principle is the
strong law of large numbers of Theorem 3.1.5, as explained above. 

• The variance principle. 

\[ p_{\rm Var}(t) = ES(t) + \alpha\,\operatorname{var}(S(t)) \,, \]

for some positive $\alpha$. In the renewal model, this principle is equivalent in an
asymptotic sense to the expected value principle with a positive loading.
Indeed, using Proposition 3.1.3, it is not difficult to see that the ratio of
the premiums charged by both principles converges to a positive constant
as $t \to \infty$, and $\alpha$ plays the role of a positive safety loading.

• The standard deviation principle. 

\[ p_{\rm SD}(t) = ES(t) + \alpha\,\sqrt{\operatorname{var}(S(t))} \,, \]

for some positive $\alpha$. The rationale for this principle is the central limit
theorem since in the renewal model (see Theorem 3.1.5),

\[ P(S(t) - p_{\rm SD}(t) \le x) \to \Phi(\alpha) \,, \quad x \in \mathbb{R} \,, \]

where $\Phi$ is the standard normal distribution function. Convince yourself
that this relation holds. In the renewal model, the standard deviation 
principle and the net principle are equivalent in the sense that the ratio of 
the two premiums converges to 1 as t — > oo. This means that one charges 
a smaller premium by using this principle in comparison to the expected 
value and variance principles. 

The interpretation of the premium calculation principles depends on the un- 
derlying model. In the renewal and Cramer-Lundberg models the interpreta- 
tion follows by using the central limit theorem and the strong law of large 
numbers. If we assumed the mixed homogeneous Poisson process as the claim 
number process, the over-dispersion property, i.e., var (N(t)) > EN(t), would 
lead to completely different statements. For example, for a mixed compound 
homogeneous Poisson process $p_{\rm Var}(t)/p_{\rm EV}(t) \to \infty$ as $t \to \infty$. Verify this!
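
The four principles above are simple functions of ES(t) and var(S(t)). The following Python sketch is a hedged illustration: the function names are hypothetical, and the moments are those of the Cramer-Lundberg model, $ES(t) = \lambda t\,EX_1$ and $\operatorname{var}(S(t)) = \lambda t\,E(X_1^2)$. The usage example takes unit Poisson intensity and standard exponential claim sizes ($EX_1 = 1$, $E(X_1^2) = 2$), the setting of Figure 3.1.7.

```python
import math


def cramer_lundberg_moments(t, lam, mean_claim, second_moment_claim):
    """Return (ES(t), var(S(t))) in the Cramer-Lundberg model."""
    return lam * t * mean_claim, lam * t * second_moment_claim


def net_premium(mean_s):
    return mean_s


def expected_value_premium(mean_s, rho):
    """(1 + rho) ES(t) for a safety loading rho > 0."""
    return (1.0 + rho) * mean_s


def variance_premium(mean_s, var_s, alpha):
    """ES(t) + alpha var(S(t)) for alpha > 0."""
    return mean_s + alpha * var_s


def standard_deviation_premium(mean_s, var_s, alpha):
    """ES(t) + alpha sqrt(var(S(t))) for alpha > 0."""
    return mean_s + alpha * math.sqrt(var_s)


if __name__ == "__main__":
    for t in (10.0, 100.0, 1000.0):
        mean_s, var_s = cramer_lundberg_moments(t, 1.0, 1.0, 2.0)
        print(
            t,
            net_premium(mean_s),
            expected_value_premium(mean_s, 0.3),
            variance_premium(mean_s, var_s, 0.15),
            standard_deviation_premium(mean_s, var_s, 5.0),
        )
```

With these parameters the expected value principle with loading 0.3 and the variance principle with $\alpha = 0.15$ give the same premium 1.3 t, while the ratio of the standard deviation premium to the net premium approaches 1 as t grows.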
Figure 3.1.7 Visualization of the premium calculation principles in the
Cramer-Lundberg model with Poisson intensity 1 and standard exponential claim sizes.
Left: The premiums are: for the net principle $p_{\rm Net}(t) = t$, for the standard
deviation principle $p_{\rm SD}(t) = t + 5\sqrt{2t}$ and for the expected value principle
$p_{\rm EV}(t) = 1.3\,t$ for $\rho = 0.3$. Equivalently, $p_{\rm EV}(t)$ corresponds to the variance
principle $p_{\rm Var}(t) = 1.3\,t$ with $\alpha = 0.15$. One sample path of the total claim
amount process S is also given. Notice that S(t) can lie above or below
$p_{\rm Net}(t)$. Right: The differences $S(t) - p(t)$ are given. The upper curve
corresponds to $p_{\rm Net}$.



Comments 

Various other theoretical premium principles have been introduced in the 
literature; see for example Bühlmann [19], Kaas et al. [46] or Klugman et al.
[51]. In Exercise 2 below one finds theoretical requirements taken from the 
actuarial literature that a “reasonable” premium calculation principle should 
satisfy. As a matter of fact, just one of these premium principles satisfies all 
requirements. It is the net premium principle which is not reasonable from an 
economic point of view since its application leads to ruin in the portfolio. 

Exercises 

(1) Assume the renewal model for the total claim amount process S with
$\operatorname{var}(X_1) < \infty$ and $\operatorname{var}(W_1) < \infty$.

(a) Show that the standard deviation principle is motivated by the central limit
theorem, i.e., as $t \to \infty$,

\[ P(S(t) - p_{\rm SD}(t) \le x) \to \Phi(\alpha) \,, \quad x \in \mathbb{R} \,, \]

where $\Phi$ is the standard normal distribution. This means that $\alpha$ is the
$\Phi(\alpha)$-quantile of the normal distribution.

(b) Show that the net principle and the standard deviation principle are
asymptotically equivalent in the sense that

\[ \frac{p_{\rm Net}(t)}{p_{\rm SD}(t)} \to 1 \quad \text{as } t \to \infty \,. \]

(c) Argue why the net premium principle and the standard deviation principle are
“sufficient for a risk neutral insurer only”, i.e., these principles do not lead to
a positive relative average profit in the long run: consider the relative gains
(p(t) - ES(t))/ES(t) for large t.

(d) Show that for h > 0,

\[ \lim_{t\to\infty} ES(t-h, t] = h\,\frac{EX_1}{EW_1} \,. \]

Hint: Appeal to Blackwell’s renewal theorem; see p. 66.

(2) In the insurance literature one often finds theoretical requirements on the
premium principles. Here are a few of them:

• Non-negative loading: $p(t) \ge ES(t)$.

• Consistency: the premium for S(t) + c is p(t) + c.

• Additivity: for independent total claim amounts S(t) and S'(t) with
corresponding premiums p(t) and p'(t), the premium for S(t) + S'(t) should be
p(t) + p'(t).

• Homogeneity or proportionality: for c > 0, the premium for cS(t) should be
cp(t).

Which of the premium principles satisfies these conditions in the Cramer-Lundberg
or renewal models?

(3) Calculate the mean and the variance of the total claim amount S(t) under
the condition that N is mixed Poisson with $(N(t))_{t\ge 0} = (\widetilde N(\theta t))_{t\ge 0}$, where
$\widetilde N$ is a standard homogeneous Poisson process, $\theta > 0$ is a mixing variable with
$\operatorname{var}(\theta) < \infty$, and $(X_i)$ is an iid claim size sequence with $\operatorname{var}(X_1) < \infty$. Show
that

\[ p_{\rm Var}(t)/p_{\rm EV}(t) \to \infty \,, \quad t \to \infty \,. \]

Compare the latter limit relation with the case when N is a renewal process.

(4) Assume the Cramer-Lundberg model with Poisson intensity $\lambda > 0$ and consider
the corresponding risk process

\[ U(t) = u + ct - S(t) \,, \]

where u > 0 is the initial capital in the portfolio, c > 0 the premium rate and S
the total claim amount process. The risk process and its meaning are discussed in
detail in Chapter 4. In addition, assume that the moment generating function
$m_{X_1}(h) = E\exp\{hX_1\}$ of the claim sizes $X_i$ is finite in some neighborhood
$(-h_0, h_0)$ of the origin.

(a) Calculate the moment generating function of S(t) and show that it exists in
$(-h_0, h_0)$.

(b) The premium rate c is determined according to the expected value principle:
$c = (1 + \rho)\,\lambda\,EX_1$ for some positive safety loading $\rho$, where the value c
(equivalently, the value $\rho$) can be chosen according to the exponential premium
principle.$^1$ For its definition, write $v_\alpha(u) = e^{-\alpha u}$ for $u, \alpha > 0$. Then c is
chosen as the solution to the equation

\[ v_\alpha(u) = E[v_\alpha(U(t))] \quad \text{for all } t > 0 \,. \tag{3.1.7} \]

Use (a) to show that a unique solution $c = c_\alpha > 0$ to (3.1.7) exists. Calculate
the safety loading $\rho_\alpha$ corresponding to $c_\alpha$ and show that $\rho_\alpha > 0$.

(c) Consider $c_\alpha$ as a function of $\alpha > 0$. Show that $\lim_{\alpha \downarrow 0} c_\alpha = \lambda\,EX_1$. This means
that $c_\alpha$ converges to the value suggested by the net premium principle with
safety loading $\rho = 0$.



3.2 Claim Size Distributions 

In this section we are interested in the question: 

What are realistic claim size distributions? 

This question is about the goodness of fit of the claim size data to the chosen 
distribution. It is not our goal to give sophisticated statistical analyses, but
we rather aim at introducing some classes of distributions used in insurance 
practice, which are sufficiently flexible and give a satisfactory fit to the data. 
In Section 3.2.1 we introduce QQ-plots and in Section 3.2.3 mean excess plots 
as two graphical methods for discriminating between different claim size dis- 
tributions. Since realistic claim size distributions are very often heavy-tailed, 
we start in Section 3.2.2 with an informal discussion of the notions of heavy-
and light-tailed distributions. In Section 3.2.4 we introduce some of the ma- 
jor claim size distributions and discuss their properties. In Sections 3.2.5 and 
3.2.6 we continue to discuss natural heavy-tailed distributions for insurance: 
the classes of the distributions with regularly varying tails and the subex- 
ponential distributions. The latter class is by now considered as the class of 
distributions for modeling large claims. 

3.2.1 An Exploratory Statistical Analysis: QQ-Plots 

We consider some simple exploratory statistical tools and apply them to simu- 
lated and real-life claim size data in order to detect which distributions might 
give a reasonable fit to real-life insurance data. We start with a quantile- 
quantile plot, for short QQ-plot, and continue in Section 3.2.3 with a mean 
excess plot. Quantiles correspond to the “inverse” of a distribution function, 
which is not always well-defined (distribution functions are not necessarily 
strictly increasing). We focus on a left-continuous version. 

1 This premium calculation principle is not intuitively motivated by the strong law 
of large numbers or the central limit theorem, but by so-called utility theory. 
The reader who wants to learn about the rationale of this principle is referred to 
Chapter 1 in Kaas et al. [46]. 
Figure 3.2.1 A distribution function F on $[0, \infty)$ and its quantile function $F^{\leftarrow}$. In
a sense, $F^{\leftarrow}$ is the mirror image of F with respect to the line x = y.



Definition 3.2.2 (Quantile function) 

The generalized inverse of the distribution function F, i.e., 

\[ F^{\leftarrow}(t) = \inf\{x \in \mathbb{R} : F(x) \ge t\} \,, \quad 0 < t < 1 \,, \]

is called the quantile function of the distribution function F. The quantity
$x_t = F^{\leftarrow}(t)$ defines the t-quantile of F.

If F is strictly increasing (such as the distribution function $\Phi$ of the standard
normal distribution), we see that $F^{\leftarrow} = F^{-1}$ on the image of F, i.e.,
the ordinary inverse of F. An illustration of the quantile function is given in
Figure 3.2.1. Notice that intervals where F is constant turn into jumps of $F^{\leftarrow}$,
and jumps of F turn into intervals of constancy for $F^{\leftarrow}$.

In this way we can define the generalized inverse of the empirical distribution
function $F_n$ of a sample $X_1, \ldots, X_n$, i.e.,

\[ F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{(-\infty, x]}(X_i) \,, \quad x \in \mathbb{R} \,. \tag{3.2.8} \]

It is easy to verify that $F_n$ has all properties of a distribution function:

• $\lim_{x\to-\infty} F_n(x) = 0$ and $\lim_{x\to\infty} F_n(x) = 1$.

• $F_n$ is non-decreasing: $F_n(x) \le F_n(y)$ for $x \le y$.

• $F_n$ is right-continuous: $\lim_{y \downarrow x} F_n(y) = F_n(x)$ for every $x \in \mathbb{R}$.

Let $X_{(1)} \le \cdots \le X_{(n)}$ be the ordered sample of $X_1, \ldots, X_n$. In what follows,
we assume that the sample does not have ties, i.e., $X_{(1)} < \cdots < X_{(n)}$ a.s. For
example, if the $X_i$'s are iid with a density the sample does not have ties; see
the proof of Lemma 2.1.9 for an argument.

Since the empirical distribution function of a sample is itself a distribution
function, one can calculate its quantile function $F_n^{\leftarrow}$ which we call the
empirical quantile function. If the sample has no ties then it is not difficult to
see that

\[ F_n(X_{(k)}) = k/n \,, \quad k = 1, \ldots, n \,, \]

i.e., $F_n$ jumps by 1/n at every value $X_{(k)}$ and is constant in $[X_{(k)}, X_{(k+1)})$
for k < n. This means that the empirical quantile function $F_n^{\leftarrow}$ jumps at the
values k/n, k = 1, ..., n - 1, by $X_{(k+1)} - X_{(k)}$, and remains constant in
((k - 1)/n, k/n]:

\[
F_n^{\leftarrow}(t) =
\begin{cases}
X_{(k)} & t \in ((k-1)/n,\, k/n] \,, \quad k = 1, \ldots, n-1 \,, \\
X_{(n)} & t \in ((n-1)/n,\, 1) \,.
\end{cases}
\]
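
The empirical quantile function is easy to evaluate directly from the definition of the generalized inverse. The sketch below is an illustration, with the function name `empirical_quantile` chosen here for the example: it returns the smallest order statistic $X_{(k)}$ whose empirical distribution value k/n is at least t.

```python
def empirical_quantile(sample, t):
    """Return inf{x : F_n(x) >= t} for 0 < t < 1.

    For a sample without ties F_n(X_(k)) = k/n, so the answer is the
    order statistic X_(k) with the smallest k such that k/n >= t.
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    for k in range(1, n + 1):
        if k / n >= t:
            return ordered[k - 1]
    return ordered[-1]  # not reached for t < 1; kept as a safeguard


if __name__ == "__main__":
    data = [3.0, 1.0, 4.0, 2.0]
    for t in (0.1, 0.25, 0.26, 0.5, 0.75, 0.99):
        print(t, empirical_quantile(data, t))
```

The linear scan mirrors the definition; for large samples a binary search over k would be the natural refinement.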

A fundamental result of probability theory, the Glivenko-Cantelli lemma, 
(see for example Billingsley [13], p. 275) tells us the following: if $X_1, X_2, \ldots$ is
an iid sequence with distribution function F, then

\[ \sup_{x\in\mathbb{R}} |F_n(x) - F(x)| \to 0 \quad \text{a.s.,} \]

implying that $F_n(x) \approx F(x)$ uniformly for all x. One can show that the
Glivenko-Cantelli lemma implies $F_n^{\leftarrow}(t) \to F^{\leftarrow}(t)$ a.s. as $n \to \infty$ for all
continuity points t of $F^{\leftarrow}$; see Resnick [64], p. 5. This observation is the basic
idea for the QQ-plot: if $X_1, \ldots, X_n$ were a sample with known distribution
function F, we would expect that $F_n^{\leftarrow}(t)$ is close to $F^{\leftarrow}(t)$ for all $t \in (0, 1)$,
provided n is large. Thus, if we plot $F_n^{\leftarrow}(t)$ against $F^{\leftarrow}(t)$ for $t \in (0, 1)$ we
should roughly see a straight line.

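It is common to plot the graph

\[ \Big\{ \Big( X_{(k)},\, F^{\leftarrow}\Big(\frac{k}{n+1}\Big) \Big) : k = 1, \ldots, n \Big\} \]

for a given distribution function F. Modifications of the plotting positions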
have been used as well. Chambers [21] gives the following properties of a 
QQ-plot: 

(a) Comparison of distributions. If the data were generated from a random 
sample of the reference distribution, the plot should look roughly linear. 
This remains true if the data come from a linear transformation of the 
distribution. 

(b) Outliers. If one or a few of the data values are contaminated by gross error 
or for any reason are markedly different in value from the remaining val- 
ues, the latter being more or less distributed like the reference distribution, 
the outlying points may be easily identified on the plot.

Figure 3.2.3 QQ-plots for samples of size 1 000. Standard exponential (top left),
standard log-normal (top right) and Pareto distributed data with tail index 4 (bottom
left) versus the standard exponential quantiles. Bottom right: Student $t_4$-distributed
data versus the quantiles of the standard normal distribution. The $t_4$-distribution has
tails $F(-x) = 1 - F(x) = c\,x^{-4}(1 + o(1))$ as $x \to \infty$, some c > 0, in contrast to the
standard normal with tails $\Phi(-x) = 1 - \Phi(x) = (\sqrt{2\pi}\,x)^{-1} \exp\{-x^2/2\}(1 + o(1))$;
see (3.2.9).



(c) Location and scale. Because a change of one of the distributions by a linear 
transformation simply transforms the plot by the same transformation, 
one may estimate graphically (through the intercept and slope ) location 
and scale parameters for a sample of data, on the assumption that the 
data come from the reference distribution. 

(d) Shape. Some difference in distributional shape may be deduced from the 
plot. For example if the reference distribution has heavier tails (tends to
have more large values) the plot will curve down at the left and/or up at
the right.

For an illustration of (a) and (d), also for a two-sided distribution, see Fig- 
ure 3.2.3. QQ-plots applied to real-life claim size data (Danish fire insurance, 
US industrial fire) are presented in Figures 3.2.5 and 3.2.15. QQ-plots applied 
to the Danish fire insurance inter-arrival times are given in Figures 2.1.22 and 
2.1.23. 
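
The points of a QQ-plot can be computed in a few lines. The following Python sketch is an illustration under stated assumptions: the reference distribution is exponential, the plotting positions k/(n + 1) are one common choice, and the function names are introduced here for the example. Each order statistic is paired with the corresponding reference quantile; plotting these pairs with any graphics tool gives the QQ-plot.

```python
import math


def exponential_quantile(t, lam=1.0):
    """Quantile function of Exp(lam): -log(1 - t) / lam for 0 < t < 1."""
    return -math.log(1.0 - t) / lam


def qq_points(sample, reference_quantile):
    """Pair the k-th order statistic with the reference quantile at k/(n+1)."""
    ordered = sorted(sample)
    n = len(ordered)
    return [
        (x, reference_quantile(k / (n + 1)))
        for k, x in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    # An idealized "sample": exact Exp(1) quantiles multiplied by 2.
    sample = [2.0 * exponential_quantile(k / 6) for k in (3, 1, 5, 2, 4)]
    for x, q in qq_points(sample, exponential_quantile):
        print(round(x, 4), round(q, 4))
```

In this idealized example the data are a scale transformation of the reference quantiles, so all points lie on a straight line through the origin, in line with properties (a) and (c).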

3.2.2 A Preliminary Discussion of Heavy- and Light-Tailed 
Distributions 

The Danish fire insurance data and the US industrial fire data presented in 
Figures 3.2.5 and 3.2.15, respectively, can be modeled by a very heavy-tailed 
distribution. Such claim size distributions typically occur in a reinsurance 
portfolio, where the largest claims are insured. In this context, the question 
arises: 

What determines a heavy-tailed/light-tailed claim size distribution? 

There is no clear-cut answer to this question. One common way to characterize 
the heaviness of the tails is by means of the exponential distribution as a 
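benchmark. For example, if

\[ \limsup_{x\to\infty} \frac{\overline F(x)}{e^{-\lambda x}} < \infty \quad \text{for some } \lambda > 0 \,, \]

where

\[ \overline F(x) = 1 - F(x) \,, \quad x > 0 \,, \]

denotes the right tail of the distribution function F, we could call F
light-tailed, and if

\[ \liminf_{x\to\infty} \frac{\overline F(x)}{e^{-\lambda x}} > 0 \quad \text{for all } \lambda > 0 \,, \]

we could call F heavy-tailed.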

Example 3.2.4 (Some well-known heavy- and light-tailed claim size distri- 
butions) 

From the above definitions, the exponential Exp($\lambda$) distribution is light-tailed
for every $\lambda > 0$.

A standard claim size distribution is the truncated normal. This means
that the $X_i$'s have distribution function $F(x) = P(|Y| \le x)$ for a normally
distributed random variable Y. If we assume Y standard normal,
$F(x) = 2\,(\Phi(x) - 0.5)$ for x > 0, where $\Phi$ is the standard normal distribution
function with density

\[ \varphi(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}} \,, \quad x \in \mathbb{R} \,. \]

An application of l’Hospital’s rule shows that

\[ \lim_{x\to\infty} \frac{\overline{\Phi}(x)}{x^{-1}\,\varphi(x)} = 1 \,. \tag{3.2.9} \]

The latter relation is often referred to as Mills' ratio. With Mills' ratio in mind,
it is easy to verify that the truncated normal distribution is light-tailed. Using
an analogous argument, it can be shown that the gamma distribution, for any
choice of parameters, is light-tailed. Verify this.

Figure 3.2.5 Top left: Danish fire insurance claim size data in millions of Danish
Kroner (1985 prices). The data correspond to the period 1980-1992. There is a
total of 2 493 observations. Top right: Histogram of the log-data. Bottom left:
QQ-plot of the data against the standard exponential distribution. The graph is curved
down at the right indicating that the right tail of the distribution of the data is
significantly heavier than the exponential. Bottom right: Mean excess plot of the
data. The graph increases in its whole domain. This is a strong indication of heavy
tails of the underlying distribution. See Example 3.2.11 for some comments.

A typical example of a heavy-tailed claim size distribution is the Pareto
distribution with tail parameter $\alpha > 0$ and scale parameter $\kappa > 0$, given by

\[ \overline F(x) = \frac{\kappa^\alpha}{(\kappa + x)^\alpha} \,, \quad x > 0 \,. \]

Another prominent heavy-tailed distribution is the Weibull distribution with
shape parameter $\tau < 1$ and scale parameter c > 0:

\[ \overline F(x) = e^{-c\,x^\tau} \,, \quad x > 0 \,. \]

However, for $\tau \ge 1$ the Weibull distribution is light-tailed. We refer to
Tables 3.2.17 and 3.2.19 for more distributions used in insurance practice. □
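
The limit relation (3.2.9) used in the example above can be inspected numerically. The following Python sketch is only an illustration: the function names are chosen here, and the normal tail is computed through the complementary error function `math.erfc` of the standard library. It evaluates the quotient of the normal tail and $x^{-1}\varphi(x)$ for increasing x.

```python
import math


def normal_density(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_tail(x):
    """Right tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mills_quotient(x):
    """Quotient (1 - Phi(x)) / (phi(x) / x) appearing in (3.2.9)."""
    return normal_tail(x) * x / normal_density(x)


if __name__ == "__main__":
    for x in (1.0, 2.0, 5.0, 10.0, 20.0):
        print(x, mills_quotient(x))
```

The quotient stays below 1 and increases towards 1, which is consistent with (3.2.9); a numerical check of this kind does not replace the l'Hospital argument.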



3.2.3 An Exploratory Statistical Analysis: Mean Excess Plots 

The reader might be surprised about the rather arbitrary way in which we dis- 
criminated heavy-tailed distributions from light-tailed ones. There are, how- 
ever, some very good theoretical reasons for the extraordinary role of the 
exponential distribution as a benchmark distribution, as will be explained in 
this section. 

One tool in order to compare the thickness of the tails of distributions on 
[0,oo) is the mean excess function. 

Definition 3.2.6 (Mean excess function) 

Let Y be a non-negative random variable with finite mean, distribution F and
$x_l = \inf\{x : F(x) > 0\}$ and $x_r = \sup\{x : F(x) < 1\}$. Then its mean excess
or mean residual life function is given by

\[ e_F(u) = E(Y - u \mid Y > u) \,, \quad u \in (x_l, x_r) \,. \]

For our purposes, we mostly consider distributions on $[0, \infty)$ which have support
unbounded to the right. The quantity $e_F(u)$ is often referred to as the
mean excess over the threshold value u. In an insurance context, $e_F(u)$ can
be interpreted as the expected claim size in the unlimited layer, over priority u.
Here $e_F(u)$ is also called the mean excess loss function. In a reliability
or medical context, $e_F(u)$ is referred to as the mean residual life function. In
a financial risk management context, switching from the right tail to the left
tail, $e_F(u)$ is referred to as the expected shortfall.

The mean excess function of the distribution function F can be written in
the form

\[ e_F(u) = \frac{1}{\overline F(u)} \int_u^{\infty} \overline F(y)\,dy \,, \quad u \in [0, x_r) \,. \tag{3.2.10} \]

This formula is often useful for calculations or for deriving theoretical
properties of the mean excess function.

Another interesting relationship between $e_F$ and the tail $\overline F$ is given by

\[ \overline F(x) = \frac{e_F(0)}{e_F(x)} \exp\Big\{ -\int_0^x \frac{1}{e_F(y)}\,dy \Big\} \,, \quad x > 0 \,. \tag{3.2.11} \]

Here we assumed in addition that F is continuous and F(x) > 0 for all x > 0.
Under these additional assumptions, F and $e_F$ determine each other in a
unique way. Therefore the tail $\overline F$ of a non-negative distribution F and its
mean excess function $e_F$ are in a sense equivalent notions. The properties of
F can be translated into the language of the mean excess function $e_F$ and
vice versa.

Derive (3.2.10) and (3.2.11) yourself. Use the relation $EY = \int_0^\infty P(Y > y)\,dy$
which holds for any positive random variable Y.

Example 3.2.7 (Mean excess function of the exponential distribution) 
Consider Y with exponential Exp($\lambda$) distribution for some $\lambda > 0$. It is an easy
exercise to verify that

\[ e_F(u) = \lambda^{-1} \,, \quad u > 0 \,. \tag{3.2.12} \]

This property is another manifestation of the forgetfulness property of the
exponential distribution; see p. 26. Indeed, the tail of the excess distribution
function of Y satisfies

\[ P(Y > u + x \mid Y > u) = P(Y > x) \,, \quad x > 0 \,. \]

This means that this distribution function corresponds to an Exp($\lambda$) random
variable; it does not depend on the threshold u. □

Property (3.2.12) makes the exponential distribution unique: it offers another
way of discriminating between heavy- and light-tailed distributions of random
variables which are unbounded to the right. Indeed, if $e_F(u)$ converged to
infinity for $u \to \infty$, we could call F heavy-tailed; if $e_F(u)$ converged to a finite
constant as $u \to \infty$, we could call F light-tailed. In an insurance context this is
quite a sensible definition since unlimited growth of $e_F(u)$ expresses the danger
of the underlying distribution F in its right tail, where the large claims come
from: given the claim size $X_i$ exceeded the high threshold u, it is very likely
that future claim sizes pierce an even higher threshold. On the other hand,
for a light-tailed distribution F, the expectation of the excess $(X_i - u)_+$ (here
$x_+ = \max(0, x)$) converges to zero (as for the truncated normal distribution)
or to a positive constant (as in the exponential case), given $X_i > u$ and the
threshold u increases to infinity. This means that claim sizes with light-tailed
distributions are much less dangerous (costly) than heavy-tailed distributions.

In Table 3.2.9 we give the mean excess functions of some standard claim
size distributions. In Figure 3.2.8 we illustrate the qualitative behavior of
$e_F(u)$ for large u.
Figure 3.2.8 Graphs of the mean excess functions $e_F(u)$ for some standard
distributions; see Table 3.2.9 for the corresponding parameterizations. Note that
heavy-tailed distributions typically have $e_F(u)$ tending to infinity as $u \to \infty$.

Distribution          Mean excess function $e_F(u)$

Pareto                $(\kappa + u)/(\alpha - 1)$, $\alpha > 1$
Burr                  $\dfrac{u}{\alpha\tau - 1}\,(1 + o(1))$, $\alpha\tau > 1$
Log-gamma             $\dfrac{u}{\alpha - 1}\,(1 + o(1))$, $\alpha > 1$
Log-normal            $\dfrac{\sigma^2 u}{\log u - \mu}\,(1 + o(1))$
Benktander type I     $\dfrac{u}{\alpha + 2\beta \log u}$
Benktander type II    $\dfrac{u^{1-\beta}}{\alpha}$
Weibull               $\dfrac{u^{1-\tau}}{c\,\tau}\,(1 + o(1))$
Exponential           $\lambda^{-1}$
Gamma                 $\beta^{-1}\Big(1 + \dfrac{\alpha - 1}{\beta u} + o\Big(\dfrac{1}{u}\Big)\Big)$
Truncated normal      $u^{-1}\,(1 + o(1))$

Table 3.2.9 Mean excess functions for some standard distributions. The
parameterization is taken from Tables 3.2.17 and 3.2.19. The asymptotic relations are to
be understood for $u \to \infty$.
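
Formula (3.2.10) lends itself to a numerical check of the entries for the exponential and Pareto distributions. The sketch below is an illustration under assumptions made here: the function name `mean_excess`, the substitution used to handle the infinite integration range, the midpoint rule with a fixed number of steps, and the Pareto tail $\overline F(y) = (\kappa/(\kappa + y))^\alpha$ with $\kappa = 1$, $\alpha = 3$.

```python
import math


def mean_excess(tail, u, n_steps=20000):
    """Approximate e_F(u) = (1 / tail(u)) * integral of tail(y) over (u, infinity).

    The substitution y = u + s / (1 - s), 0 <= s < 1, maps the infinite
    range onto [0, 1); the midpoint rule never evaluates the endpoint s = 1.
    """
    h = 1.0 / n_steps
    integral = 0.0
    for j in range(n_steps):
        s = (j + 0.5) * h
        y = u + s / (1.0 - s)
        integral += tail(y) / (1.0 - s) ** 2 * h  # dy = ds / (1 - s)^2
    return integral / tail(u)


def exponential_tail(y, lam=2.0):
    return math.exp(-lam * y)


def pareto_tail(y, kappa=1.0, alpha=3.0):
    return (kappa / (kappa + y)) ** alpha


if __name__ == "__main__":
    print("exponential:", mean_excess(exponential_tail, 1.0), "closed form", 1.0 / 2.0)
    print("Pareto:     ", mean_excess(pareto_tail, 2.0), "closed form", (1.0 + 2.0) / (3.0 - 1.0))
```

For the exponential tail the result does not depend on u, while for the Pareto tail it grows linearly in u, which is the qualitative difference between the light- and heavy-tailed cases discussed above.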
If one deals with claim size data with an unknown distribution function
F, one does not know the mean excess function $e_F$. As it is often done in
statistics, we simply replace F in $e_F$ by its sample version, the empirical
distribution function $F_n$; see (3.2.8). The resulting quantity $e_{F_n}$ is called the
empirical mean excess function. Since $F_n$ has bounded support, we consider
$e_{F_n}$ only for $u \in [X_{(1)}, X_{(n)})$:

\[
e_{F_n}(u) = E_{F_n}(Y - u \mid Y > u) = \frac{E_{F_n}(Y - u)_+}{\overline F_n(u)}
  = \frac{n^{-1} \sum_{i=1}^{n} (X_i - u)_+}{\overline F_n(u)} \,. \tag{3.2.13}
\]

An alternative expression for $e_{F_n}$ is given by

\[ e_{F_n}(u) = \frac{\sum_{i:\, i \le n,\, X_i > u} (X_i - u)}{\#\{i \le n : X_i > u\}} \,. \]

An application of the strong law of large numbers to (3.2.13) yields the
following result.

Proposition 3.2.10 Let $X_i$ be iid non-negative random variables with
distribution function F which are unbounded to the right. If $EX_1 < \infty$, then for
every u > 0, $e_{F_n}(u) \to e_F(u)$ a.s. as $n \to \infty$.

A graphical test for tail behavior can now be based on e Fn - A mean excess 
plot ( ME-plot ) consists of the graph 



{ 5 ^ F n (A)/^)) . k 1, . . . , 1 } • 



For our purposes, the ME-plot is used only as a graphical method, mainly for 
distinguishing between light- and heavy-tailed models; see Figure 3.2.12 for 
some simulated examples. Indeed caution is called for when interpreting such 
plots. Due to the sparseness of the data available for calculating $e_{F_n}(u)$ for
large $u$-values, the resulting plots are very sensitive to changes in the data
towards the end of the range; see Figure 3.2.13 for an illustration. For this
reason, more robust versions like median excess plots and related procedures
have been suggested; see for instance Beirlant et al. [10] or Rootzén and Tajvidi
[68] . For a critical assessment concerning the use of mean excess functions in 
insurance, see Rytgaard [69]. 
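
The definition of the empirical mean excess function translates directly into
code. The following is a minimal Python sketch using only the standard library.
The function names empirical_mean_excess and me_plot_points are illustrative
and not taken from the text, and the simulated standard exponential sample
merely stands in for real claim data.

```python
import bisect
import random


def empirical_mean_excess(data, u):
    """Average of X_i - u over all observations X_i > u."""
    xs = sorted(data)
    first = bisect.bisect_right(xs, u)  # index of the first observation > u
    exceedances = xs[first:]
    if not exceedances:
        raise ValueError("u must be smaller than the sample maximum")
    return sum(x - u for x in exceedances) / len(exceedances)


def me_plot_points(data):
    """Points (X_(k), e_{F_n}(X_(k))) for the order statistics below the maximum."""
    xs = sorted(data)
    return [(u, empirical_mean_excess(xs, u)) for u in xs if u < xs[-1]]


# Toy sample: the observations above u = 2 are 4 and 7, with excesses 2 and 5.
toy = [1.0, 2.0, 4.0, 7.0]
toy_value = empirical_mean_excess(toy, 2.0)  # (2 + 5) / 2 = 3.5

# Simulated standard exponential claims: e_F(u) = 1 for every u, so the
# empirical version should be close to 1 for moderate thresholds.
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(1000)]
exp_value = empirical_mean_excess(sample, 0.5)
points = me_plot_points(sample)
```

Plotting the pairs in points against each other gives an ME-plot of the kind
discussed above. The last few points rest on very few exceedances, which is
the source of the instability at the right end of the range.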

Example 3.2.11 (Exploratory data analysis for some real-life data) 

In Figures 3.2.5 and 3.2.15 we have graphically summarized some properties of 
two real-life data sets. The data underlying Figure 3.2.5 correspond to Danish 
fire insurance claims in millions of Danish Kroner (1985 prices). The data 
were communicated to us by Mette Rytgaard and correspond to the period 
1980-1992, inclusive. There is a total of n = 2 493 observations.

Figure 3.2.12 The mean excess function plot for 1 000 simulated data and the
corresponding theoretical mean excess function $e_F$ (solid line): standard exponential
(top left), log-normal (top right) with $\log X \sim N(0, 4)$, Pareto (bottom) with tail
index 1.7.



The second insurance data set, presented in Figure 3.2.15, corresponds to a
portfolio of US industrial fire data (n = 8 043) reported over a two-year
period. The portfolio manager definitely considers this data set
“dangerous”, i.e., large claim considerations do enter substantially in the final
premium calculation.

A first glance at the figures and Table 3.2.14 for both data sets immediately
reveals heavy-tailedness and skewness to the right. The corresponding mean
excess functions are close to a straight line, which indicates that the
underlying distributions may be modeled by Pareto-like distribution functions.

Figure 3.2.13 The mean excess function of the Pareto distribution $\overline F(x) = x^{-1.7}$,
$x \ge 1$ (straight line), together with 20 simulated mean excess plots, each based on
simulated data (n = 1 000) from the above distribution. Note the very unstable
behavior, especially towards the higher values of $u$. This is typical and makes the precise
interpretation of $e_{F_n}(u)$ difficult; see also Figure 3.2.12.



The QQ-plots against the standard exponential quantiles also clearly show 
tails much heavier than exponential ones. 



Data                    Danish    Industrial
n                        2 493         8 043
min                      0.313         0.003
1st quartile             1.157         0.587
median                   1.634         1.526
mean                     3.063         14.65
3rd quartile             2.645         4.488
max                      263.3        13 520
$\widehat{x}_{0.99}$     24.61         184.0

Table 3.2.14 Basic statistics for the Danish and the industrial fire data;
$\widehat{x}_{0.99}$ stands for the empirical 99%-quantile.



Comments 

The importance of the mean excess function (or plot) as a diagnostic tool 
for insurance data is nicely demonstrated in Hogg and Klugman [44] ; see also 
Beirlant et al. [10] and the references therein.

Figure 3.2.15 Exploratory data analysis of insurance claims caused by industrial
fire: the data (top left), the histogram of the log-transformed data (top right), the 
ME-plot (bottom left) and a QQ-plot against standard exponential quantiles (bottom 
right). See Example 3.2.11 for some comments. 



3.2.4 Standard Claim Size Distributions and Their Properties 

Classical non-life insurance mathematics was most often concerned with claim 
size distributions with light tails in the sense which has been made precise in 
Section 3.2.3. We refer to Table 3.2.17 for a collection of such distributions. 
These distributions have mean excess functions $e_F(u)$ converging to some
finite limit as $u \to \infty$, provided the support is infinite. For obvious reasons,
we call them small claim distributions. One of the main reasons for the pop- 
ularity of these distributions is that they are standard distributions in statis- 
tics. Classical statistics deals with the normal and the gamma distributions, 













Figure 3.2.16 Exploratory data analysis of insurance claims caused by water: the 
data (top, left), the histogram of the log-transformed data (top, right), the ME-plot 
(bottom). Notice the kink in the ME-plot in the range (5 000,6 000) reflecting the 
fact that the data seem to cluster towards some specific upper value. 



among others, and in any introductory course on statistics we learn about 
these distributions because they have certain optimality conditions (closure 
of the normal and gamma distributions under convolutions, membership in 
exponential families, etc.) and therefore we can apply standard estimation 
techniques such as maximum likelihood. 

In Figure 3.2.16 one can find a claim size sample which one could model by 
one of the distributions from Table 3.2.17. Indeed, notice that the mean excess 
plot of these data curves down at the right end, indicating that the right tail of 
the underlying distribution is not too dangerous. It is also common practice to 
fit distributions with bounded support to insurance claim data, for example by

Name               Tail $\overline F$ or density $f$                                           Parameters
Exponential        $\overline F(x) = e^{-\lambda x}$                                           $\lambda > 0$
Gamma              $f(x) = \dfrac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}$   $\alpha, \beta > 0$
Weibull            $\overline F(x) = e^{-c x^{\tau}}$                                          $c > 0$, $\tau \ge 1$
Truncated normal   $f(x) = \sqrt{2/\pi}\; e^{-x^2/2}$                                          -
Any distribution with bounded support

Table 3.2.17 Claim size distributions: “small claims”.



truncating any of the heavy-tailed distributions in Table 3.2.19 at a certain 
upper limit. This makes sense if the insurer has to cover claim sizes only 
up to this upper limit or for a certain layer. In this situation it is, however, 
reasonable to use the full data set (not just the truncated data) for estimating 
the parameters of the distribution. 

Over the last few years the (re-)insurance industry has faced new chal- 
lenges due to climate change, pollution, riots, earthquakes, terrorism, etc. 
We refer to Table 3.2.18 for a collection of the largest insured losses 1970- 
2002, taken from Sigma [73]. For this kind of data one would not use the 
distributions of Table 3.2.17, but rather those presented in Table 3.2.19. All 
distributions of this table are heavy-tailed in the sense that their mean excess 
functions $e_F(u)$ increase to infinity as $u \to \infty$; cf. Table 3.2.9. As a matter
of fact, the distributions of Table 3.2.19 are not easily fitted since several of
their characteristics (such as the tail index $\alpha$ of the Pareto distribution) can
be estimated only by using the largest upper order statistics in the sample. 
In this case, extreme value statistics is called for. This means that, based on 
theoretical (semi-)parametric models from extreme value theory such as the 
extreme value distributions and the generalized Pareto distribution, one needs 
to fit those distributions from a relatively small number of upper order statis- 
tics or from the excesses of the underlying data over high thresholds. We refer 
to Embrechts et al. [29] for an introduction to the world of extremes. 
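
To make the idea of estimating from upper order statistics concrete, here is a
minimal sketch of one classical estimator of this kind, the Hill estimator. The
text above does not name this estimator; it is used here only as an
illustration, and the function name hill_estimator, the sample size, the choice
k = 500 and the simulated Pareto sample are all assumptions made for the
example.

```python
import math
import random


def hill_estimator(data, k):
    """Hill estimate of the tail index alpha based on the k largest observations."""
    xs = sorted(data, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < n")
    threshold = xs[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold must be positive")
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess


# Simulated Pareto claims with tail P(X > x) = x^(-1.7), x >= 1, obtained by
# inverse transform sampling from uniform random numbers in (0, 1].
alpha = 1.7
rng = random.Random(42)
claims = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(5000)]

# Only the k = 500 largest claims enter the estimate.
alpha_hat = hill_estimator(claims, k=500)
```

The estimate depends on the number k of upper order statistics used, and in
practice one would inspect it over a range of k rather than trust a single
value.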

We continue with some more specific comments on the distributions in 
Table 3.2.19. With the possible exception of the log-normal distribution, these
distributions are rarely covered in a standard course on statistics or
probability theory.

The Pareto, Burr, log-gamma and truncated $\alpha$-stable distributions have
in common that their right tail is of the asymptotic form

Losses   Date       Event                                                   Country
20 511   08/24/92   Hurricane “Andrew”                                      US, Bahamas
19 301   09/11/01   Terrorist attack on WTC, Pentagon and other buildings   US
16 989   01/17/94   Northridge earthquake in California                     US
 7 456   09/27/91   Tornado “Mireille”                                      Japan
 6 321   01/25/90   Winter storm “Daria”                                    Europe
 6 263   12/25/99   Winter storm “Lothar”                                   Europe
 6 087   09/15/89   Hurricane “Hugo”                                        P. Rico, US
 4 749   10/15/87   Storm and floods                                        Europe
 4 393   02/26/90   Winter storm “Vivian”                                   Europe
 4 362   09/22/99   Typhoon “Bart” hits the south of the country            Japan
 3 895   09/20/98   Hurricane “Georges”                                     US, Caribbean
 3 200   06/05/01   Tropical storm “Allison”; flooding                      US
 3 042   07/06/88   Explosion on “Piper Alpha” offshore oil rig             UK
 2 918   01/17/95   Great “Hanshin” earthquake in Kobe                      Japan
 2 592   12/27/99   Winter storm “Martin”                                   France, Spain, CH
 2 548   09/10/99   Hurricane “Floyd”, heavy down-pours, flooding           US, Bahamas
 2 500   08/06/02   Rains, flooding                                         Europe
 2 479   10/01/95   Hurricane “Opal”                                        US, Mexico
 2 179   03/10/93   Blizzard, tornadoes                                     US, Mexico, Canada
 2 051   09/11/92   Hurricane “Iniki”                                       US, North Pacific
 1 930   04/06/01   Hail, floods and tornadoes                              US
 1 923   10/23/89   Explosion at Philips Petroleum                          US
 1 864   09/03/79   Hurricane “Frederic”                                    US
 1 835   09/05/96   Hurricane “Fran”                                        US
 1 824   09/18/74   Tropical cyclone “Fifi”                                 Honduras
 1 771   09/03/95   Hurricane “Luis”                                        Caribbean
 1 675   04/27/02   Spring storm with several tornadoes                     US
 1 662   09/12/88   Hurricane “Gilbert”                                     Jamaica
 1 620   12/03/99   Winter storm “Anatol”                                   Europe
 1 604   05/03/99   Series of 70 tornadoes in the Midwest                   US
 1 589   12/17/83   Blizzard, cold wave                                     US, Mexico, Canada
 1 585   10/20/91   Forest fire which spread to urban area                  US
 1 570   04/02/74   Tornadoes in 14 states                                  US
 1 499   04/25/73   Flooding on the Mississippi                             US
 1 484   05/15/98   Wind, hail and tornadoes (MN, IA)                       US
 1 451   10/17/89   “Loma Prieta” earthquake                                US
 1 436   08/04/70   Hurricane “Celia”                                       US
 1 409   09/19/98   Typhoon “Vicki”                                         Japan, Philippines
 1 358   01/05/98   Cold spell with ice and snow                            Canada, US
 1 340   05/05/95   Wind, hail and flooding                                 US

Table 3.2.18 The 40 most costly insurance losses 1970-2002. Losses are in
million $US indexed to 2002 prices. The table is taken from Sigma [73] with friendly
permission of Swiss Re Zurich.















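Name                 Tail $\overline F$ or density $f$                                                                    Parameters
Log-normal           $f(x) = \dfrac{1}{\sqrt{2\pi}\, \sigma x}\, e^{-(\log x - \mu)^2/(2\sigma^2)}$                        $\mu \in \mathbb{R}$, $\sigma > 0$
Pareto               $\overline F(x) = \left(\dfrac{\kappa}{\kappa + x}\right)^{\alpha}$                                   $\alpha, \kappa > 0$
Burr                 $\overline F(x) = \left(\dfrac{\kappa}{\kappa + x^{\tau}}\right)^{\alpha}$                            $\alpha, \kappa, \tau > 0$
Benktander type I    $\overline F(x) = \left(1 + 2(\beta/\alpha) \log x\right) e^{-\beta (\log x)^2 - (\alpha + 1) \log x}$   $\alpha, \beta > 0$
Benktander type II   $\overline F(x) = e^{\alpha/\beta}\, x^{-(1-\beta)}\, e^{-\alpha x^{\beta}/\beta}$                     $\alpha > 0$, $0 < \beta < 1$
Weibull              $\overline F(x) = e^{-c x^{\tau}}$                                                                   $c > 0$, $0 < \tau < 1$
Log-gamma            $f(x) = \dfrac{\alpha^{\beta}}{\Gamma(\beta)}\, (\log x)^{\beta - 1}\, x^{-\alpha - 1}$               $\alpha, \beta > 0$
Truncated            $\overline F(x) = P(|X| > x)$,                                                                       $1 < \alpha < 2$
$\alpha$-stable      where $X$ is an $\alpha$-stable random variable

Table 3.2.19 Claim size distributions: “large claims”. All distributions have
support $(0, \infty)$ except for the Benktander cases and the log-gamma with $(1, \infty)$. For the
definition of an $\alpha$-stable distribution, see Embrechts et al. [29], p. 71; cf. Exercise 16
on p. 56.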



$$\lim_{x \to \infty} \frac{\overline F(x)}{x^{-\alpha} (\log x)^{\gamma}} = c ,$$

for some constants $\alpha, c > 0$ and $\gamma \in \mathbb{R}$. Tails of this kind are called regularly
varying. We will come back to this notion in Section 3.2.5. 

The log-gamma, Pareto and log-normal distributions are obtained by an 
exponential transformation of a random variable with gamma, exponential 
and normal distribution, respectively. For example, let $Y$ be $N(\mu, \sigma^2)$
distributed. Then $\exp\{Y\}$ has the log-normal distribution with density given in
Table 3.2.19. The goal of these exponential transformations of random vari- 
ables with a standard light-tailed distribution is to create heavy-tailed distribu- 
tions in a simple way. An advantage of this procedure is that by a logarithmic 
transformation of the data one returns to the standard light-tailed distribu- 
tions. In particular, one can use standard theory for the estimation of the 
underlying parameters. 
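
The log-normal case can be sketched in a few lines of Python (standard library
only). The parameter values, the sample size and the function name
fit_lognormal are illustrative assumptions. The point is only that, after
taking logarithms, the estimation problem is the familiar one for a normal
sample.

```python
import math
import random


def fit_lognormal(claims):
    """Maximum likelihood estimates (mu, sigma) computed from the log-claims."""
    logs = [math.log(x) for x in claims]
    n = len(logs)
    mu_hat = sum(logs) / n
    # The normal MLE of the variance divides by n, not by n - 1.
    sigma_hat = math.sqrt(sum((y - mu_hat) ** 2 for y in logs) / n)
    return mu_hat, sigma_hat


# Heavy-tailed sample: X = exp(Y) with Y ~ N(mu, sigma^2) is log-normal.
rng = random.Random(1)
mu, sigma = 0.0, 2.0  # log X ~ N(0, 4)
claims = [math.exp(rng.gauss(mu, sigma)) for _ in range(5000)]

# The logarithmic transformation returns to the normal model.
mu_hat, sigma_hat = fit_lognormal(claims)
```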





Some of the distributions in Table 3.2.19 were introduced as extensions of
the Pareto, log-normal and Weibull ($\tau < 1$) distributions as classical
heavy-tailed distributions. For example, the Burr distribution differs from the Pareto
distribution only by the additional shape parameter $\tau$. As a matter of fact,
practice in extreme value statistics (see for example Chapter 6 in Embrechts et
al. [29], or convince yourself by a simulation study) shows that it is hard, if not
impossible, to distinguish between the log-gamma, Pareto and Burr distributions
based on parameter (for example maximum likelihood) estimation. It is indeed
difficult to estimate the tail parameter $\alpha$, the shape parameter $\tau$ or the scale
parameter $\kappa$ accurately in any of the cases. Similar remarks apply to the
Benktander type I and the log-normal distributions, as well as the Benktander
type II and the Weibull ($\tau < 1$) distributions. The Benktander distributions
were introduced in the insurance world for one particular reason: one can
explicitly calculate their mean excess functions; cf. Table 3.2.9.
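
As an illustration of this explicit calculation, the following worked derivation
treats the Benktander type II case. It assumes the tail in the form listed in
Table 3.2.19, with support $(1, \infty)$, and uses the standard representation of
the mean excess function as an integrated tail divided by the tail.

```latex
% Benktander type II tail, \alpha > 0, 0 < \beta < 1, x \ge 1:
\[
  \overline F(x) = e^{\alpha/\beta}\, x^{-(1-\beta)}\, e^{-\alpha x^{\beta}/\beta} .
\]
% The tail has an explicit antiderivative:
\[
  \frac{d}{dy}\Big( -\frac{1}{\alpha}\, e^{\alpha/\beta}\, e^{-\alpha y^{\beta}/\beta} \Big)
  = e^{\alpha/\beta}\, y^{\beta-1}\, e^{-\alpha y^{\beta}/\beta}
  = \overline F(y) .
\]
% Hence, for u \ge 1,
\[
  e_F(u)
  = \frac{1}{\overline F(u)} \int_u^{\infty} \overline F(y)\, dy
  = \frac{\alpha^{-1}\, e^{\alpha/\beta}\, e^{-\alpha u^{\beta}/\beta}}
         {e^{\alpha/\beta}\, u^{-(1-\beta)}\, e^{-\alpha u^{\beta}/\beta}}
  = \frac{u^{1-\beta}}{\alpha} .
\]
```

This agrees with the Benktander type II entry of Table 3.2.9 and holds exactly,
not only asymptotically.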



3.2.5 Regularly Varying Claim Sizes and Their Aggregation 

Although the distribution functions F in Table 3.2.19 look different, some of 
them are quite similar with regard to their asymptotic tail behavior. These
include the Pareto, Burr, stable and log-gamma distributions. In particular,
their right tails can be written in the form

$$\overline F(x) = 1 - F(x) = \frac{L(x)}{x^{\alpha}} , \quad x > 0 ,$$

for some constant $\alpha > 0$ and a positive measurable function $L(x)$ on $(0, \infty)$
satisfying

$$\lim_{x \to \infty} \frac{L(cx)}{L(x)} = 1 \quad \text{for all } c > 0 . \qquad (3.2.14)$$

A function with this property is called slowly varying (at infinity). Examples 
of such functions are: 



constants, logarithms, powers of logarithms, iterated logarithms. 
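
For instance, property (3.2.14) can be checked directly for the logarithm and
its powers. The short derivation below is a worked illustration; the power
function at the end is included only as a contrasting example of a function
that fails the condition.

```latex
% L(x) = \log x, with c > 0 fixed:
\[
  \frac{\log(cx)}{\log x} = 1 + \frac{\log c}{\log x} \to 1 ,
  \qquad x \to \infty .
\]
% L(x) = (\log x)^{\gamma}, \gamma \in \mathbb{R}:
\[
  \frac{(\log(cx))^{\gamma}}{(\log x)^{\gamma}}
  = \Big( 1 + \frac{\log c}{\log x} \Big)^{\gamma} \to 1 ,
  \qquad x \to \infty .
\]
% In contrast, a power function x^{\delta}, \delta \ne 0, is not slowly varying:
\[
  \frac{(cx)^{\delta}}{x^{\delta}} = c^{\delta} \ne 1 \quad \text{for } c \ne 1 .
\]
```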



Every slowly varying function has the representation 



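$$L(x) = c_0(x) \exp\left\{ \int_{x_0}^{x} \frac{\varepsilon(t)}{t} \, dt \right\} \quad \text{for } x \ge x_0, \text{ some } x_0 > 0 , \qquad (3.2.15)$$

where $\varepsilon(t) \to 0$ as $t \to \infty$ and $c_0(t)$ is a positive function satisfying $c_0(t) \to c_0$
for some positive constant $c_0$. Using representation (3.2.15), one can show
that for every $\delta > 0$,

$$\lim_{x \to \infty} \frac{L(x)}{x^{\delta}} = 0 \quad \text{and} \quad \lim_{x \to \infty} x^{\delta} L(x) = \infty , \qquad (3.2.16)$$

i.e., $L$ is “small” compared to any power function $x^{\delta}$.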
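
A small numerical illustration may help to see how slowly these limits set in.
The sketch below takes the slowly varying function L(x) = log x; the values
delta = 0.1 and c = 1000 and the grid of x-values are arbitrary choices made for
the example.

```python
import math

delta = 0.1
xs = [10.0 ** k for k in (10, 50, 100, 200)]

# L(x) = log x is slowly varying: L(c x) / L(x) tends to 1 (here c = 1000).
ratios = [math.log(1000.0 * x) / math.log(x) for x in xs]

# (3.2.16): L(x) / x^delta tends to 0, while x^delta * L(x) tends to infinity.
small = [math.log(x) / x ** delta for x in xs]
large = [math.log(x) * x ** delta for x in xs]

for x, r, s, g in zip(xs, ratios, small, large):
    print(f"x = {x:.0e}  L(cx)/L(x) = {r:.4f}  "
          f"L(x)/x^delta = {s:.3e}  x^delta L(x) = {g:.3e}")
```

For x = 10^10 the quotient L(x)/x^delta is still larger than 2, so the
asymptotic statements in (3.2.16) say little about moderate values of x when
delta is small.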




